Description

BACKGROUND OF THE INVENTION 
1. Field of the Invention

[0001] The present invention relates to novel polynucleotides derived from microorganisms belonging to coryneform
bacteria and fragments thereof, polypeptides encoded by the polynucleotides and fragments thereof, polynucleotide
arrays comprising the polynucleotides and fragments thereof, computer readable recording media in which the nucleotide
sequences of the polynucleotide and fragments thereof have been recorded, and use of them as well as a method
of using the polynucleotide and/or polypeptide sequence information to make comparisons.

2. Brief Description of the Background Art 

[0002] Coryneform bacteria are used in producing various useful substances, such as amino acids, nucleic acids,
vitamins, saccharides (for example, ribulose), organic acids (for example, pyruvic acid), and analogues of the above-described
substances (for example, N-acetylamino acids) and are very useful microorganisms industrially. Many mutants
thereof are known.

[0003] For example, Corynebacterium glutamicum is a Gram-positive bacterium identified as a glutamic acid-producing
bacterium, and many amino acids are produced by mutants thereof. For example, [illegible]
glutamic acid which is useful as a seasoning for [illegible]
additive for livestock feeds and the like, and [illegible]
L-proline, L-glutamine, L-tryptophan, and the like, have been produced in the world (Nikkei Bio Yearbook 99, published [illegible]).

[0004] [illegible]
mutants) which have a mutated metabolic pathway and regulatory [illegible]
various metabolic regulatory systems so as not to produce more amino acids than it needs. [illegible]
lysine, for example, a microorganism belonging to the genus Corynebacterium [illegible]
the excessive production by concerted inhibition by lysine and threonine against the activity of a [illegible]
common to lysine, threonine and methionine, i.e., an aspartokinase (J. Biochem., 65: 849-859 (1969)). The biosynthesis
[illegible] controlled by repressing the expression of its biosynthesis gene [illegible]thesize
an excessive amount of arginine (Microbiology, 142: 99-108 (1996)). It is considered that these [illegible]
regulatory mechanisms are deregulated in amino acid-producing mutants. Similarly, [illegible]ulated
in mutants producing nucleic acids, vitamins, saccharides, organic acids and analogues of the above-described [illegible]

[0005] [illegible] is insufficient in comparison with Escherichia coli, Bacillus subtilis, and the like. Also, few findings
[illegible] on mutated genes in amino acid-producing mutants. Thus, there are various mechanisms, which are still unknown,
regulating the growth and metabolism of these microorganisms.

[0006] A chromosomal physical map of Corynebacterium glutamicum ATCC 13032 is reported and it is known that
its genome size is about 3,100 kb (Mol. Gen. Genet., 252: 255-265 (1996)). Calculating [illegible]
density of bacteria, it is presumed that about 3,000 genes are present in the genome of about 3,100 kb. [illegible]
about 100 genes mainly concerning amino acid biosynthesis genes are known in Corynebacterium glutamicum, and
the nucleotide sequences of most genes have not been clarified hitherto.

[0007] In recent years, the full nucleotide sequence of the genomes of [illegible] Escherichia
coli, Mycobacterium tuberculosis, yeast, and the like, have been determined (Science, 277: 1453-62 (1997); Nature,
393: 537-544 (1998); Nature, 387: 5-105 (1997)). Based on the thus determined full nucleotide sequences, assumption
[illegible] of their function by comparison with the nucleotide [illegible]
been carried out. Thus, the functions of a great number of genes have been presumed, without genetic, biochemical
or molecular biological experiments.

[0008] In recent years, moreover, techniques for monitoring expression levels of a great number of genes simultaneously
or detecting mutations, using DNA chips, DNA arrays or the like in which a [illegible]
gene or a partial nucleic acid fragment in genomic DNA other than a gene is fixed to [illegible]
developed. The techniques contribute to the analysis of microorganisms, such as [illegible]
Mycobacterium bovis used in BCG vaccines, and the like (Science, 278: 680-686 (1997); Proc. Natl. Acad. Sci. USA,
96: 12833-38 (1999); Science, 284: 1520-23 (1999)).
SUMMARY OF THE INVENTION

[0009] An object of the present invention is to provide a polynucleotide and a polypeptide derived from a microorganism
of coryneform bacteria which are industrially useful, sequence information of the polynucleotide and the
polypeptide, a method for analyzing the microorganism, an apparatus and a system for use in the analysis, and a
method for breeding the microorganism.

[0010] The present invention provides a polynucleotide and an oligonucleotide derived from a microorganism belonging
to coryneform bacteria, oligonucleotide arrays to which the polynucleotides and the oligonucleotides are fixed,
a polypeptide encoded by the polynucleotide, an antibody which recognizes the polypeptide, polypeptide arrays to
which the polypeptides or the antibodies are fixed, a computer readable recording medium in which the nucleotide
sequences of the polynucleotide and the oligonucleotide and the amino acid sequence of the polypeptide have been
recorded, and a system based on the computer using the recording medium as well as a method of using the polynucleotide
and/or polypeptide sequence information to make comparisons.

BRIEF DESCRIPTION OF THE DRAWING

[0011] Fig. 1 is a map showing the positions of typical genes on the genome of Corynebacterium glutamicum ATCC [illegible].

[0012] Fig. 2 is electrophoresis showing the results of proteome analyses using proteins derived from (A) Corynebacterium
glutamicum ATCC 13032, (B) FERM BP-7134, and (C) FERM BP-158.

[0013] Fig. 3 is a flow chart of an example of a system using the computer readable media according to the present
invention.

[0014] Fig. 4 is a flow chart of an example of a system using the computer readable media according to the present
invention.
DETAILED DESCRIPTION OF THE INVENTION 

[0015] This application is based on Japanese applications No. Hei. 11-377484 filed on December 16, 1999, No.
[illegible] on April [illegible], 2000 and No. 2000-[illegible] filed on August 3, 2000, the entire contents of which are
incorporated hereinto by reference.

[0016] From the viewpoint that the determination of the full nucleotide sequence of Corynebacterium glutamicum
would make it possible to specify gene regions which had not been previously identified, to determine the function of
an unknown gene derived from the microorganism through comparison with nucleotide sequences of known genes
and amino acid sequences of known genes, and to obtain a useful mutant based on the presumption of the metabolic
regulatory mechanism of a useful product by the microorganism, the inventors conducted intensive studies and, as a
result, found that the complete genome sequence of Corynebacterium glutamicum can be determined by applying the
whole genome shotgun method.

[0017] Specifically, the present invention relates to the following (1) to (65):

(1) A method for at least one of the following:

(A) identifying a mutation point of a gene derived from a mutant of a coryneform bacterium, 

(B) measuring an expression amount of a gene derived from a coryneform bacterium, 

(C) analyzing an expression profile of a gene derived from a coryneform bacterium, 

(D) analyzing expression patterns of genes derived from a coryneform bacterium, or 

(E) identifying a gene homologous to a gene derived from a coryneform bacterium, 

said method comprising: 

(a) producing a polynucleotide array by adhering to a solid support at least two polynucleotides selected
from the group consisting of first polynucleotides comprising the nucleotide sequence represented by any
one of SEQ ID NOS:1 to 3501, second polynucleotides which hybridize with the first polynucleotides under
stringent conditions, and third polynucleotides comprising a sequence of 10 to 200 continuous bases of
the first or second polynucleotides,

(b) incubating the polynucleotide array with at least one of a labeled polynucleotide derived from a coryneform
bacterium, a labeled polynucleotide derived from a mutant of the coryneform bacterium or a
labeled polynucleotide to be examined, under hybridization conditions,

(c) detecting any hybridization, and 

(d) analyzing the result of the hybridization. 
As used herein, for example, the at least two polynucleotides can be at least two of the first polynucleotides,
at least two of the second polynucleotides, at least two of the third polynucleotides, or at least
two of the first, second and third polynucleotides.

(2) The method according to (1), wherein the coryneform bacterium is a microorganism belonging to the genus
Corynebacterium, the genus Brevibacterium, or the genus Microbacterium.

(3) The method according to (2), wherein the microorganism belonging to the genus Corynebacterium is selected
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

(4) The method according to (1), wherein the polynucleotide derived from a coryneform bacterium, the polynucleotide
derived from a mutant of the coryneform bacterium or the polynucleotide to be examined is a gene relating
to the biosynthesis of at least one compound selected from an amino acid, a nucleic acid, a vitamin, a saccharide,
an organic acid, and analogues thereof.

(5) The method according to (1), wherein the polynucleotide to be examined is derived from Escherichia coli.

(6) A polynucleotide array, comprising: 

at least two polynucleotides selected from the group consisting of first polynucleotides comprising the nucleotide
sequence represented by any one of SEQ ID NOS:1 to 3501, second polynucleotides which hybridize
with the first polynucleotides under stringent conditions, and third polynucleotides comprising 10 to 200 continuous
bases of the first or second polynucleotides, and
a solid support adhered thereto.

As used herein, for example, the at least two polynucleotides can be at least two of the first polynucleotides,
at least two of the second polynucleotides, at least two of the third polynucleotides, or at least two of the first,
second and third polynucleotides.

(7) A polynucleotide comprising the nucleotide sequence represented by SEQ ID NO:1 or a polynucleotide having
a homology of at least 80% with the polynucleotide.

(8) A polynucleotide comprising any one of the nucleotide sequences represented by SEQ ID NOS:2 to 3431, or
a polynucleotide which hybridizes with the polynucleotide under stringent conditions.

(9) A polynucleotide encoding a polypeptide having any one of the amino acid sequences represented by SEQ ID
NOS:3502 to 6931, or a polynucleotide which hybridizes therewith under stringent conditions.

(10) A polynucleotide which is present in the 5' upstream or 3' downstream of a polynucleotide comprising the
nucleotide sequence of any one of SEQ ID NOS:2 to 3431 in a whole polynucleotide comprising the nucleotide
sequence represented by SEQ ID NO:1, and has an activity of regulating an expression of the polynucleotide.

(11) A polynucleotide comprising 10 to 200 continuous bases in the nucleotide sequence of the polynucleotide of
any one of (7) to (10), or a polynucleotide comprising a nucleotide sequence complementary to the polynucleotide
comprising 10 to 200 continuous bases.

(12) A recombinant DNA comprising the polynucleotide of any one of (8) to (11).

(13) A transformant comprising the polynucleotide of any one of (8) to (11) or the recombinant DNA of (12).

(14) A method for producing a polypeptide, comprising: 

culturing the transformant of (13) in a medium to produce and accumulate a polypeptide encoded by the
polynucleotide of (8) or (9) in the medium, and
recovering the polypeptide from the medium.

(15) A method for producing at least one of an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid,
and analogues thereof, comprising:

culturing the transformant of (13) in a medium to produce and accumulate at least one of an amino acid, a
nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof in the medium, and
recovering the at least one of the amino acid, the nucleic acid, the vitamin, the saccharide, the organic acid,
and analogues thereof from the medium.

(16) A polypeptide encoded by a polynucleotide comprising the nucleotide sequence selected from SEQ ID NOS:
2 to 3431.

(17) A polypeptide comprising the amino acid sequence selected from SEQ ID NOS:3502 to 6931.

(18) The polypeptide according to (16) or (17), wherein at least one amino acid is deleted, replaced, inserted or
added, [illegible] substantially the same as that of the polypeptide without said
at least one amino acid deletion, replacement, insertion or addition.

(19) A polypeptide comprising an amino acid sequence having a homology of at least 60% with the amino acid
sequence of the polypeptide of (16) or (17), and having an activity which is substantially [illegible]

(20) An antibody which recognizes the polypeptide of any one of (16) to (19).

(21) A polypeptide array, comprising:

[illegible] polypeptide or partial fragment polypeptide selected from the polypeptides of (16) to (19) and
partial fragment polypeptides of the polypeptides, and
a solid support adhered thereto.

(22) A polypeptide array, comprising:

[illegible] of (16) to (19) and partial fragment polypeptides of the polypeptides, and
a solid support adhered thereto.

(ii) a data storage device for at least temporarily storing the input information;

(iii) a comparator that compares the at least one nucleotide sequence information selected from SEQ ID NOS:
1 to 3501 with the target sequence or target structure motif information, recorded by the data storage device,
for screening and analyzing nucleotide sequence information which is coincident with or analogous to the
target sequence or target structure motif information; and

(iv) an output device that shows a screening or analyzing result obtained by the comparator.

(24) A method based on a computer for identifying a target sequence or a target structure motif derived from a
coryneform bacterium, comprising the following:

(i) inputting at least one nucleotide sequence information selected from SEQ ID NOS:1 to 3501, target sequence
information or target structure motif information into a user input device;

(ii) at least temporarily storing said information;

(iii) comparing the at least one nucleotide sequence information selected from SEQ ID NOS:1 to 3501 with
the target sequence or target structure motif information; and

(iv) screening and analyzing nucleotide sequence information which is coincident with or analogous to the
target sequence or target structure motif information.

(25) A system based on a computer for identifying a target sequence or a target structure motif derived from a
coryneform bacterium, comprising the following:

(i) a user input device that inputs at least one amino acid sequence information selected from SEQ ID NOS:
3502 to 7001, and target sequence or target structure motif information;

(ii) a data storage device for at least temporarily storing the input information;

(iii) a comparator that compares the at least one amino acid sequence information selected from SEQ ID NOS:
3502 to 7001 with the target sequence or target structure motif information, recorded by the data storage
device, for screening and analyzing amino acid sequence information which is coincident with or analogous to
the target sequence or target structure motif information; and

(iv) an output device that shows a screening or analyzing result obtained by the comparator.

(26) A method based on a computer for identifying a target sequence or a target structure motif derived from a
coryneform bacterium, comprising the following:

(i) inputting at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001, and target
sequence information or target structure motif information into a user input device;

(ii) at least temporarily storing said information;

(iii) comparing the at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001
with the target sequence or target structure motif information; and

(iv) screening and analyzing amino acid sequence information which is coincident with or analogous to the
target sequence or target structure motif information.

(27) A system based on a computer for determining a function of a polypeptide encoded by a polynucleotide having
a target nucleotide sequence derived from a coryneform bacterium, comprising the following:

(i) a user input device that inputs at least one nucleotide sequence information selected from SEQ ID NOS:2
to 3501, function information of a polypeptide encoded by the nucleotide sequence, and target nucleotide
sequence information;

(ii) a data storage device for at least temporarily storing the input information;

(iii) a comparator that compares the at least one nucleotide sequence information selected from SEQ ID NOS:
2 to 3501 with the target nucleotide sequence information, and determines a function of a polypeptide encoded
by a polynucleotide having the target nucleotide sequence which is coincident with or analogous to the
polynucleotide having at least one nucleotide sequence selected from SEQ ID NOS:2 to 3501; and

(iv) an output device that shows a function obtained by the comparator.

(28) A method based on a computer for determining a function of a polypeptide encoded
by a polynucleotide having a target nucleotide sequence derived from a coryneform bacterium, comprising the
following:

(i) inputting at least one nucleotide sequence information selected from SEQ ID NOS:2 to 3501, function
information of a polypeptide encoded by the nucleotide sequence, and target nucleotide sequence information;

(ii) at least temporarily storing said information;

(iii) comparing the at least one nucleotide sequence information selected from SEQ ID NOS:2 to 3501 with
the target nucleotide sequence information; and

(iv) determining a function of a polypeptide encoded by a polynucleotide having the target nucleotide sequence
which is coincident with or analogous to the polynucleotide having at least one nucleotide sequence selected
from SEQ ID NOS:2 to 3501.

(29) A system based on a computer for determining a function of a polypeptide having a target amino acid sequence
derived from a coryneform bacterium, comprising the following:

(i) a user input device that inputs at least one amino acid sequence information selected from SEQ ID NOS:
3502 to 7001, function information based on the amino acid sequence, and target amino acid sequence
information;

(ii) a data storing device for at least temporarily storing the input information;

(iii) a comparator that compares the at least one amino acid sequence information selected from SEQ ID NOS:
3502 to 7001 with the target amino acid sequence information for determining a function of a polypeptide
having the target amino acid sequence which is coincident with or analogous to the polypeptide having at least
one amino acid sequence selected from SEQ ID NOS:3502 to 7001; and

(iv) an output device that shows a function obtained by the comparator.

(30) A method based on a computer for determining a function of a polypeptide having a target amino acid sequence
derived from a coryneform bacterium, comprising the following:

(i) inputting at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001, function
information based on the amino acid sequence, and target amino acid sequence information;

(ii) at least temporarily storing said information;

(iii) comparing the at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001
with the target amino acid sequence information; and

(iv) determining a function of a polypeptide having the target amino acid sequence which is coincident with or
analogous to the polypeptide having at least one amino acid sequence selected from SEQ ID NOS:3502 to
7001.

(31) The system according to any one of (23), (25), (27) and (29), wherein a coryneform bacterium is a microorganism
of the genus Corynebacterium, the genus Brevibacterium, or the genus Microbacterium.

(32) The method according to any one of (24), (26), (28) and (30), wherein a coryneform bacterium is a microorganism
of the genus Corynebacterium, the genus Brevibacterium, or the genus Microbacterium.

(33) The system according to (31), wherein the microorganism belonging to the genus Corynebacterium is selected
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

(34) The method according to (32), wherein the microorganism belonging to the genus Corynebacterium is selected
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

(35) A recording medium or storage device which is readable by a computer in which at least one nucleotide 
sequence information selected from SEQ ID NOS:1 to 3501 or function information based on the nucleotide se- 
quence is recorded, and is usable in the system of (23) or (27) or the method of (24) or (28). 

(36) A recording medium or storage device which is readable by a computer in which at least one amino acid
sequence information selected from SEQ ID NOS:3502 to 7001 or function information based on the amino acid
sequence is recorded, and is usable in the system of (25) or (29) or the method of (26) or (30).

(37) The recording medium or storage device according to (35) or (36), which is a computer readable recording
medium selected from the group consisting of a floppy disc, a hard disc, a magnetic tape, a random access
memory (RAM), a read only memory (ROM), a magneto-optic disc (MO), CD-ROM, CD-R, CD-RW, DVD-ROM,
DVD-RAM and DVD-RW.

(38) A polypeptide having a homoserine dehydrogenase activity, comprising an amino acid sequence in which the
Val residue at the 59th position in the amino acid sequence of homoserine dehydrogenase derived from a coryneform
bacterium is replaced with an amino acid residue other than a Val residue.

(39) A polypeptide comprising an amino acid sequence in which the Val residue at the 59th position in the amino
acid sequence as represented by SEQ ID NO:6952 is replaced with an amino acid residue other than a Val residue.

(40) The polypeptide according to (38) or (39), wherein the Val residue at the 59th position is replaced with an Ala
residue.

(41) A polypeptide having pyruvate carboxylase activity, comprising an amino acid sequence in which the Pro
residue at the 458th position in the amino acid sequence of pyruvate carboxylase derived from a coryneform
bacterium is replaced with an amino acid residue other than a Pro residue.

(42) A polypeptide comprising an amino acid sequence in which the Pro residue at the 458th position in the amino
acid sequence represented by SEQ ID NO:4265 is replaced with an amino acid residue other than a Pro residue.

(43) The polypeptide according to (41) or (42), wherein the Pro residue at the 458th position is replaced with a Ser
residue.

(44) The polypeptide according to any one of (38) to (43), which is derived from Corynebacterium glutamicum.

(45) A DNA encoding the polypeptide of any one of (38) to (44). 

(46) A recombinant DNA comprising the DNA of (45). 

(47) A transformant comprising the recombinant DNA of (46). 

(48) A transformant comprising in its chromosome the DNA of (45). 

(49) The transformant according to (47) or (48), which is derived from a coryneform bacterium. 

(50) The transformant according to (49), which is derived from Corynebacterium glutamicum.

(51) A method for producing L-lysine, comprising: 
culturing the transformant of any one of (47) to (50) in a medium to produce and accumulate L-lysine in the
medium, and

recovering the L-lysine from the culture.

(52) A method for breeding a coryneform bacterium using the nucleotide sequence information represented by
SEQ ID NOS:1 to 3431, comprising the following:

(i) comparing a nucleotide sequence of a genome or gene of a production strain derived from a coryneform bacterium
which has been subjected to mutation breeding so as to produce at least one compound selected from
an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof by a fermentation
method, with a corresponding nucleotide sequence in SEQ ID NOS:1 to 3431;

(ii) identifying a mutation point present in the production strain based on a result obtained by (i);

(iii) introducing the mutation point into a coryneform bacterium which is free of the mutation point; and

(iv) examining productivity by the fermentation method of the compound selected in (i) of the coryneform
bacterium obtained in (iii).

(53) The method according to (52), wherein the gene is a gene encoding an enzyme in a biosynthetic pathway or
a signal transmission pathway.

(54) The method according to (52), wherein the mutation point is a mutation point relating to a useful mutation 
which improves or stabilizes the productivity. 

(55) A method for breeding a coryneform bacterium using the nucleotide sequence information represented by
SEQ ID NOS:1 to 3431, comprising:

(i) comparing a nucleotide sequence of a genome or gene of a production strain derived from a coryneform bacterium
which has been subjected to mutation breeding so as to produce at least one compound selected from
an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof by a fermentation
method, with a corresponding nucleotide sequence in SEQ ID NOS:1 to 3431;

(ii) identifying a mutation point present in the production strain based on a result obtained by (i);

(iii) deleting a mutation point from a coryneform bacterium having the mutation point; and 

(iv) examining productivity by the fermentation method of the compound selected in (i) of the coryneform 
bacterium obtained in (iii). 

(56) The method according to (55), wherein the gene is a gene encoding an enzyme in a biosynthetic pathway or
a signal transmission pathway.

(57) The method according to (55), wherein the mutation point is a mutation point which decreases or destabilizes
the productivity.

(58) A method for breeding a coryneform bacterium using the nucleotide sequence information represented by
SEQ ID NOS:2 to 3431, comprising the following:

(i) identifying an isozyme relating to biosynthesis of at least one compound selected from an amino acid, a
nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof, based on the nucleotide sequence
information represented by SEQ ID NOS:2 to 3431;

(ii) classifying the isozyme identified in (i) into an isozyme having the same activity;

(iii) mutating all genes encoding the isozyme having the same activity simultaneously; and

(iv) examining productivity by a fermentation method of the compound selected in (i) of the coryneform bacterium
which has been transformed with the gene obtained in (iii).

(59) A method for breeding a coryneform bacterium using the nucleotide sequence information represented by 
SEQ ID NOS:2 to 3431 , comprising the following: 

(i) arranging a function information of an open reading frame (ORF) represented by SEQ ID NOS:2 to 3431 ; 

(ii) allowing the arranged ORF to correspond to an enzyme on a known biosynthesis or signal transmission
pathway;

(iii) explicating an unknown biosynthesis pathway or signal transmission pathway of a coryneform bacterium
in combination with information relating to a known biosynthesis pathway or signal transmission pathway of a
coryneform bacterium;

(iv) comparing the pathway explicated in (iii) with a biosynthesis pathway of a target useful product; and

(v) transgenetically varying a coryneform bacterium based on the nucleotide sequence information to either
strengthen a pathway which is judged to be important in the biosynthesis of the target useful product in (iv) or
weaken a pathway which is judged not to be important in the biosynthesis of the target useful product in (iv).

(60) A coryneform bacterium, bred by the method of any one of (52) to (59). 

(61) The coryneform bacterium according to (60), which is a microorganism belonging to the genus Corynebacterium,
the genus Brevibacterium, or the genus Microbacterium.

(62) The coryneform bacterium according to (61), wherein the microorganism belonging to the genus Corynebacterium
is selected from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum,
Corynebacterium acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium,
Corynebacterium melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

(63) A method for producing at least one compound selected from an amino acid, a nucleic acid, a vitamin, a 
saccharide, an organic acid and an analogue thereof, comprising: 

culturing a coryneform bacterium of any one of (60) to (62) in a medium to produce and accumulate at least
one compound selected from an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid and
analogues thereof; and

recovering the compound from the culture.

(64) The method according to (63), wherein the compound is L-lysine.

(65) A method for identifying a protein relating to useful mutation based on proteome analysis, comprising the
following:

(i) preparing 

a protein derived from a bacterium of a production strain of a coryneform bacterium which has been sub- 
jected to mutation breeding by a fermentation process so as to produce at least one compound selected 
from an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof and 
a protein derived from a bacterium of a parent strain of the production strain; 

(ii) separating the proteins prepared in (i) by two dimensional electrophoresis;

(iii) detecting the separated proteins, and comparing an expression amount of the protein derived from the
production strain with that derived from the parent strain;

(iv) treating the protein showing different expression amounts as a result of the comparison with a peptidase
to extract peptide fragments;

(v) analyzing amino acid sequences of the peptide fragments obtained in (iv); and

(vi) comparing the amino acid sequences obtained in (v) with the amino acid sequence represented by SEQ
ID NOS:3502 to 7001 to identify the protein having the amino acid sequences.

As used herein, the term "proteome", which is a word coined by combining "protein" with "genome", refers to
a method for examining a gene at the polypeptide level.

(66) The method according to (65), wherein the coryneform bacterium is a microorganism belonging to the genus
Corynebacterium, the genus Brevibacterium, or the genus Microbacterium.

(67) The method according to (66), wherein the microorganism belonging to the genus Corynebacterium is selected
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

(68) A biologically pure culture of Corynebacterium glutamicum AHP-3 (FERM BP-7382).

The present invention will be described below in more detail, based on the determination of the full nucleotide 
sequence of coryneform bacteria. 

1 . Determination of full nucleotide sequence of coryneform bacteria 

The term "coryneform bacteria" as used herein means a microorganism belonging to the genus Corynebacterium, the genus Brevibacterium or the genus Microbacterium as defined in Bergey's Manual of Determinative Bacteriology [illegible]. Examples include Corynebacterium acetoacidophilum, Corynebacterium acetoglutamicum, Corynebacterium callunae, Corynebacterium glutamicum, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium melassecola, Corynebacterium thermoaminogenes, Brevibacterium saccharolyticum, Brevibacterium immariophilum, Brevibacterium roseum, Brevibacterium thiogenitalis, Microbacterium ammoniaphilum, and the like. 

Specific examples include Corynebacterium acetoacidophilum ATCC 13870, Corynebacterium acetoglutamicum ATCC [illegible], Corynebacterium callunae ATCC 15991, Corynebacterium glutamicum ATCC 13032, [illegible], Corynebacterium glutamicum ATCC 13826 (prior genus and species: Brevibacterium flavum, or Corynebacterium lactofermentum), Corynebacterium glutamicum ATCC 14020 (prior genus and species: Brevibacterium [illegible]), Corynebacterium glutamicum ATCC 13869 (prior genus and species: Brevibacterium lactofermentum), Corynebacterium herculis ATCC 13868, Corynebacterium lilium ATCC 15990, Corynebacterium melassecola ATCC [illegible], Corynebacterium thermoaminogenes FERM 9244, Brevibacterium saccharolyticum ATCC [illegible], Brevibacterium immariophilum ATCC 14068, Brevibacterium roseum ATCC [illegible], Brevibacterium thiogenitalis ATCC 19240, Microbacterium ammoniaphilum ATCC 15354, and the like. 
(1) Preparation of genome DNA of coryneform bacteria 

[0022] Coryneform bacteria can be cultured by a conventional method. 
[0023] Any of a natural medium and a synthetic medium can be used, so long as it is a medium suitable for efficient 
culturing of the microorganism, and it contains a carbon source, a nitrogen source, an inorganic salt, and the like which 
can be assimilated by the microorganism. 
[0024] In Corynebacterium glutamicum, for example, a BY medium (7 g/l meat extract, 10 g/l peptone, [illegible] sodium 
chloride, 5 g/l yeast extract, pH 7.2) containing 1% of glycine and the like can be used. The culturing is carried out at 
[illegible]. 

[0025] After the completion of the culture, the cells are recovered from the culture by centrifugation. The resulting 
cells are washed with a washing solution. 

[0026] Examples of the washing solution include STE buffer (10.3% sucrose, 25 mmol/l Tris hydrochloride, 25 mmol/l 
ethylenediaminetetraacetic acid (hereinafter referred to as "EDTA"), pH 8.0), and the like. 

[0027] Genome DNA can be obtained from the washed cells according to a conventional method for obtaining genome 
DNA, namely, lysing the cell wall of the cells using a lysozyme and a surfactant (SDS, etc.), eliminating proteins 
and the like using a phenol solution and a phenol/chloroform solution, and then precipitating the genome DNA with 
ethanol or the like. Specifically, the following method can be illustrated. 
[0028] The washed cells are suspended in a washing solution containing 5 to 20 mg/l lysozyme. After shaking, 5 to 
20% SDS is added to lyse the cells. Usually, shaking is gently performed at 25 to 40°C for 30 minutes to 2 hours. After 
shaking, the suspension is maintained at 60 to 70°C for 5 to 15 minutes for the lysis. 

[0029] After the lysis, the suspension is cooled to ordinary temperature, and 5 to 20 ml of Tris-neutralized phenol is 
added thereto, followed by gently shaking at room temperature for 15 to 45 minutes. 
[0030] After shaking, centrifugation (15,000 x g, 20 minutes, 20°C) is carried out to fractionate the aqueous layer. 
[0031] After performing extraction with phenol/chloroform and extraction with chloroform (twice) in the same manner, 
3 mol/l sodium acetate solution (pH 5.2) and isopropanol are added to the aqueous layer at 1/10 times volume and 2 
times volume of the aqueous layer, respectively, followed by gently stirring to precipitate the genome DNA. 
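
The volume ratios used in the precipitation step can be illustrated with a minimal Python sketch. The function name and the example volume are hypothetical and serve only to show the arithmetic (1/10 volume of sodium acetate solution and 2 volumes of isopropanol relative to the aqueous layer).

```python
def precipitation_volumes(aqueous_ml):
    """Return (sodium_acetate_ml, isopropanol_ml) for a given aqueous layer volume.

    Hypothetical helper: it only applies the ratios stated in the text.
    """
    sodium_acetate_ml = aqueous_ml / 10  # 1/10 volume of 3 mol/l sodium acetate (pH 5.2)
    isopropanol_ml = aqueous_ml * 2      # 2 volumes of isopropanol
    return sodium_acetate_ml, isopropanol_ml


if __name__ == "__main__":
    acetate, isopropanol = precipitation_volumes(5.0)
    print(f"sodium acetate: {acetate} ml, isopropanol: {isopropanol} ml")
```

For a 5 ml aqueous layer, for example, this corresponds to 0.5 ml of the sodium acetate solution and 10 ml of isopropanol.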
[0032] The genome DNA is dissolved again in a buffer containing 0.01 to 0.04 mg/ml RNase. As an example of the 
buffer, TE buffer (10 mmol/l Tris hydrochloride, 1 mmol/l EDTA, pH 8.0) can be used. After dissolving, the 
solution is maintained at 25 to 40°C for 20 to 50 minutes and then extracted successively with phenol, phenol/chloroform 
and chloroform as in the above case. 

[0033] After the extraction, isopropanol precipitation is carried out and the resulting DNA precipitate is washed with 
70% ethanol, followed by air drying, and then dissolved in TE buffer to obtain a genome DNA solution. 

(2) Production of shotgun library 

[0034] A method for producing a genome DNA library using the genome DNA of the coryneform bacteria prepared in 
the above (1) includes a method described in Molecular Cloning, A Laboratory Manual, Second Edition (1989) (hereinafter 
referred to as "Molecular Cloning, 2nd ed."). In particular, the following method can be exemplified to prepare a genome 
DNA library appropriately usable in determining the full nucleotide sequence by the shotgun method. 
[0035] To 0.01 mg of the genome DNA of the coryneform bacteria prepared in the above (1), a buffer, such as TE 
buffer or the like, is added to give a total volume of 0.4 ml. Then, the genome DNA is broken into fragments of [illegible] 
10 kb with a sonicator (Yamato Powersonic Model 50). The treatment with the sonicator is performed at an output [illegible]. 

[0036] The resulting genome DNA fragments are blunt-ended using DNA blunting kit (manufactured by Takara Shuzo). 

[0037] The blunt-ended genome fragments are fractionated by agarose gel or polyacrylamide gel electrophoresis 
and genome fragments of 1 to 2 kb are cut out from the gel. 

[0038] To the gel, 0.2 to 0.5 ml of a buffer for eluting DNA, such as MG elution buffer (0.5 mol/l ammonium acetate, 
10 mmol/l magnesium acetate, 1 mmol/l EDTA, 0.1% SDS) or the like, is added, followed by shaking at 25 to 40°C 
overnight to elute DNA. 

[0039] The resulting DNA eluate is treated with phenol/chloroform and then precipitated with ethanol to obtain a [...] 

[...] Biotech) or the like, using T4 ligase (manufactured by Takara Shuzo) or the like. The ligation can be carried out by 
allowing a mixture to stand at 10 to 20°C for 20 to 50 hours. 

[0041] The resulting ligation product is precipitated with ethanol and dissolved in 5 to 20 µl of TE buffer. 

[0042] Escherichia coli is transformed in accordance with a conventional method using 0.5 to [illegible] µl of the ligation 
solution. Examples of the transformation method include the electroporation method using ELECTRO MAX DH10B 
(manufactured by Life Technologies) for Escherichia coli. The electroporation method can be carried out under the 
conditions as described in the manufacturer's instructions. 

The transformed Escherichia coli is spread on a suitable selection medium containing agar, for example, LB 
plate medium containing [illegible] ampicillin (LB medium (10 g/l bactotrypton, 5 g/l yeast extract, 10 g/l sodium 
chloride, [illegible]) containing [illegible] agar) when pUC18 is used as the cloning vector, and cultured therein. 

The transformant can be obtained as colonies formed on the plate medium. In this step, it is possible to select 
the transformant having the recombinant DNA containing the genome DNA as white colonies by adding X-gal and 
IPTG (isopropyl-β-thiogalactopyranoside) to the plate medium. 

The transformant is allowed to stand for culturing in a 96-well titer plate to which 0.05 ml of the LB medium 
containing [illegible] ampicillin has been added in each well. The resulting culture can be used in an experiment of 
(4) described below. Also, the culture solution can be stored at -80°C by adding 0.05 ml per well of the LB medium 
containing 20% glycerol to the culture solution, followed by mixing, and the stored culture solution can be used at any 
time. 

(3) Production of cosmid library 

The genome DNA (0.1 mg) of the coryneform bacteria prepared in the above (1) is partially digested with a 
restriction enzyme, such as Sau3AI or the like, and then ultracentrifuged (26,000 rpm, 18 hours, [illegible]) under a 
sucrose density gradient prepared from a 10% sucrose buffer ([illegible] NaCl, 20 mmol/l Tris hydrochloride, 5 mmol/l EDTA, 
10% sucrose, [illegible]) and a 40% sucrose buffer (elevating the concentration of the 10% sucrose buffer to 40%). 

After the ultracentrifugation, the separated solution is fractionated into tubes in 1 ml per each tube. After 
confirming the DNA fragment length of each fraction by agarose gel electrophoresis, a fraction rich in DNA fragments of 
[...] 

The DNA fragment is ligated to a cosmid vector having a cohesive end which can be ligated to the 
partially digested product. When the genome DNA is partially digested with Sau3AI, the partially digested product can be ligated to, 
for example, the BamHI site of SuperCos1 (manufactured by Stratagene) in accordance with the manufacturer's instructions. 

The resulting ligation product is packaged using a packaging extract which can be prepared by a method 
[illegible] and then used in transforming Escherichia coli. More specifically, the ligation 
product is packaged using, for example, a commercially available packaging extract, Gigapack III Gold Packaging 
Extract [...] Escherichia coli XL-1-BlueMR (manufactured by Stratagene) or the like. 

The thus transformed Escherichia coli is spread on an LB plate medium containing ampicillin, and cultured 
therein. 

The transformant can be obtained as colonies formed on the plate medium. 

The transformant is subjected to standing culture in a 96-well titer plate to which 0.05 ml of the LB 
medium [...] 

The resulting culture can be employed in an experiment of (4) described below. Also, the culture solution can 
[...] by mixing, and the stored culture solution can be used at any time. 

(4) Determination of nucleotide sequence 
(4-1) Preparation of template 

The nucleotide sequence of genome DNA of coryneform bacteria can be determined basically according 
to the whole genome shotgun method (Science, 269: 496-512 (1995)). 

SI? SS?^? n "SS * mMhod 080 " PCR •-«— 

[0056] Specifically, the template can be prepared as follows. 

[0057] The clone derived from the whole genome shotgun library is inoculated by using a replicator (manufactured 
by [illegible]) into each well of a 96-well plate to which 0.08 ml per well of the LB medium containing 0.1 mg/ml ampicillin 
has been added, followed by stationarily culturing at 37°C overnight. 

[0058] The culture solution is transported, using a copy plate (manufactured by Tokken), into each well of a 
[...] (manufactured by Takara Shuzo). Then, PCR is carried out in accordance with the 
protocol [illegible] (1998) using GeneAmp PCR System 9700 (manufactured by PE Biosystems) [...] 
[0059] The excessive primers and nucleotides are eliminated using a kit for purifying a PCR product, and the product 
is used as the template in the sequencing reaction. 
[0060] It is also possible to determine the nucleotide sequence using a double-stranded DNA plasmid as a template. 

[0061] The double-stranded DNA plasmid used as the template can be obtained by the following method. 

[0062] The clone derived from the whole genome shotgun library is inoculated into each well of a [illegible] plate 
to which 1.5 ml per well of a 2 x YT medium (16 g/l bactotrypton, 10 g/l yeast extract, 5 g/l sodium chloride, pH 7.0) 
containing 0.05 mg/ml ampicillin has been added, followed by culturing under shaking at 37°C overnight. 

[0063] The double-stranded DNA plasmid can be prepared from the culture solution using an automatic plasmid 
preparing machine KURABO PI-50 (manufactured by Kurabo Industries), a multiscreen (manufactured by Millipore) 
or the like, according to each protocol. 

[0064] To purify the plasmid, Biomek 2000 manufactured by Beckman Coulter and the like can be used. 

[0065] The resulting purified double-stranded DNA plasmid is dissolved in water to give a concentration of about 0.1 
mg/ml. Then, it can be used as the template in sequencing. 

(4-2) Sequencing reaction 

[0066] The sequencing reaction can be carried out according to a commercially available sequence kit or the like. A 
specific method is exemplified below. 

[0067] To 6 µl of a solution of ABI PRISM BigDye Terminator Cycle Sequencing Ready Reaction Kit (manufactured 
by PE Biosystems), 1 to 2 pmol of an M13 regular direction primer (M13-21) or an M13 reverse direction primer 
(M13REV) (DNA Research, 5: 1-9 (1998)) and 50 to 200 ng of the template prepared in the above (4-1) (the PCR 
product or plasmid) are added to give 10 µl of a sequencing reaction solution. 
[0068] A dye terminator sequencing reaction [...] 
PCR System 9700 (manufactured by PE Biosystems) or the like. The cycle parameter can be determined in accordance 
with a commercially available kit, for example, the manufacturer's instructions attached with ABI PRISM BigDye 
Terminator Cycle Sequencing Ready Reaction Kit. 

[0069] The sample can be purified using a commercially available product, such as Multi Screen HV plate (manufactured 
by Millipore) or the like, according to the manufacturer's instructions. 

[0070] The thus purified reaction product is precipitated with ethanol, dried and then used for the analysis. The dried 
reaction product can be stored in the dark at -30°C and the stored reaction product can be used at any time. 
[0071] The dried reaction product can be analyzed using a commercially available sequencer and an analyzer according 
to [...] 

[0072] Examples of the commercially available sequencer include ABI PRISM 377 DNA Sequencer (manufactured 
by PE Biosystems). Examples of the analyzer include ABI PRISM 3700 DNA Analyzer (manufactured by PE Biosystems). 

(5) Assembly 

[0073] A software, such as phred (The University of Washington) or the like, can be used as base call for use in 
analyzing the sequence information obtained in the above (4). A software, such as Cross_Match (The University of 
Washington) or SPS Cross_Match (manufactured by Southwest Parallel Software) or the like, can be used to mask 
the vector sequence information. 

[0074] For the assembly, a software, such as phrap (The University of Washington), SPS phrap (manufactured by 
Southwest Parallel Software) or the like, can be used. 

[0075] In the above analysis and output of the results thereof, a computer such as UNIX, PC, Macintosh, and the 
like can be used. 

[0076] Contigs obtained by the assembly can be analyzed using a graphical editor such as consed (The University 
of Washington) or the like. 

[0077] It is also possible to perform a series of the operations from the base call to the assembly in a lump using a 
script phredPhrap attached to the consed. 

[0078] As used herein, software will be understood to also be referred to as a comparator. 
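
The idea of joining overlapping sequence reads into a longer contiguous sequence can be illustrated with a minimal Python sketch. This is a toy exact-match overlap merge written for illustration only; it is not the algorithm used by phrap or any other software named above, and the function name and the `min_overlap` parameter are hypothetical.

```python
def merge_by_overlap(left, right, min_overlap=3):
    """Merge two reads when a suffix of `left` exactly equals a prefix of `right`.

    Returns the merged sequence, or None when no overlap of at least
    `min_overlap` bases exists. Base quality and mismatches are ignored.
    """
    longest_possible = min(len(left), len(right))
    # Try the longest candidate overlap first so the best match wins.
    for k in range(longest_possible, min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return left + right[k:]
    return None


if __name__ == "__main__":
    print(merge_by_overlap("ATGCCGTA", "CGTAGGC"))
```

Real assemblers additionally weigh base-call quality values and tolerate sequencing errors, which this sketch deliberately leaves out.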

(6) Determination of nucleotide sequence in gap part 

[0079] Each of the cosmids in the cosmid library constructed in the above (3) is prepared in the same manner as in 
the preparation of the double-stranded DNA plasmid described in the above (4-1). The nucleotide sequence at the end 
of the insert fragment of the cosmid is determined using a commercially available kit, such as ABI PRISM BigDye 
Terminator Cycle Sequencing Ready Reaction Kit (manufactured by PE Biosystems) according to the manufacturer's 
instructions. 
[0080] About 800 cosmid clones are sequenced at both ends of the inserted fragment to detect a nucleotide sequence 
in the contig derived from the shotgun sequencing obtained in (5) which is coincident with the sequence. Thus, the 
chain linkage between respective cosmid clones and respective contigs is clarified, and mutual alignment is carried 
out. Furthermore, the results are compared with known physical maps to map the cosmids and the contigs. In case of 
Corynebacterium glutamicum ATCC 13032, a physical map of Mol. Gen. Genet., 252: 255-265 (1996) can be used. 

[0081] The sequence in the region which cannot be covered with the contigs (gap part) can be determined by the 
following method. 

[0082] Clones containing sequences positioned at the ends of the contigs are selected. Among these, a clone wherein 
the sequence at one end of the inserted fragment has been determined is selected and the sequence at the opposite end of the 
inserted fragment is determined. 

[0083] A shotgun library clone or a cosmid clone derived therefrom containing the sequences at the respective ends 
of the inserted fragments in the two contigs is identified and the full nucleotide sequence of the inserted fragment of 
the clone is determined. 
[0084] According to this method, the nucleotide sequence of the gap part can be determined. 
[0085] When no shotgun library clone or cosmid clone covering the gap part is available, primers complementary to 
the end sequences of the two different contigs are prepared and the DNA fragment in the gap part is amplified. Then, 
sequencing is performed by the primer walking method using the amplified DNA fragment as a template or by the 
shotgun method in which the sequence of a shotgun clone prepared from the amplified DNA fragment is determined. 
Thus, the nucleotide sequence of the above-described region can be determined. 
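
The preparation of primers from the end sequences of two contigs flanking a gap can be illustrated with a minimal Python sketch. The function names, the default primer length, and the assumption that both contigs are given on the same strand in 5' to 3' order are hypothetical choices for illustration; the text does not prescribe them.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gap_primers(left_contig, right_contig, length=20):
    """Pick one primer from each contig end so that both point into the gap.

    The forward primer is the 3' end of the left contig; the reverse primer
    is the reverse complement of the 5' end of the right contig.
    """
    forward = left_contig[-length:]
    reverse = reverse_complement(right_contig[:length])
    return forward, reverse


if __name__ == "__main__":
    print(gap_primers("AAACCCGGG", "TTTGACGTA", length=4))
```

A practical design would also check melting temperature and primer specificity, which are outside the scope of this sketch.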

[0086] In a region showing a low sequence accuracy, primers are synthesized using AUTOFINISH function and 
NAVIGATING function of consed (The University of Washington) and the sequence is determined by the primer walking 
method to improve the sequence accuracy. 

[0087] Examples of the thus determined nucleotide sequence of the full genome include the full nucleotide sequence 
of genome of Corynebacterium glutamicum ATCC 13032 represented by SEQ ID NO:1. 

(7) Determination of nucleotide sequence of microorganism genome DNA using the nucleotide sequence represented 
by SEQ ID NO:1 

[0088] A nucleotide sequence of a polynucleotide having a homology of 80% or more with the full nucleotide sequence 
of Corynebacterium glutamicum ATCC 13032 represented by SEQ ID NO:1 as determined above can also be determined 
using the nucleotide sequence represented by SEQ ID NO:1, and the polynucleotide having a nucleotide sequence 
having a homology of 80% or more with the nucleotide sequence represented by SEQ ID NO:1 of the present 
invention is within the scope of the present invention. The term "polynucleotide having a nucleotide sequence having 
a homology of 80% or more with the nucleotide sequence represented by SEQ ID NO:1 of the present invention" is a 
polynucleotide in which a full nucleotide sequence of the chromosome DNA can be determined using as a primer an 
oligonucleotide composed of continuous 5 to 50 nucleotides in the nucleotide sequence represented by SEQ ID NO:1, 
by PCR using the chromosome DNA as a template. A particularly preferred primer in determination 
of the full nucleotide sequence is an oligonucleotide having nucleotide sequences which are positioned at 
intervals of [illegible] to 500 bp, and among such oligonucleotides, an oligonucleotide having a nucleotide sequence 
selected from DNAs encoding a protein relating to a main metabolic pathway is particularly preferred. The polynucleotide 
in which the full nucleotide sequence of the chromosome DNA can be determined using the oligonucleotide 
includes polynucleotides constituting a chromosome DNA derived from a microorganism belonging to coryneform bacteria. 
Such a polynucleotide is preferably a polynucleotide constituting chromosome DNA derived from a microorganism 
belonging to the genus Corynebacterium, more preferably a polynucleotide constituting a chromosome DNA of 
Corynebacterium glutamicum. 

2. Identification of ORF (open reading frame) and expression regulatory fragment and determination of the function of 
ORF 

[0089] Based on the full nucleotide sequence data of the genome derived from coryneform bacteria determined in 
the above item 1, an ORF and an expression modulating fragment can be identified. Furthermore, the function of the 
thus determined ORF can be determined. 

[0090] The ORF means a continuous region in the nucleotide sequence of mRNA which can be translated as an 
amino acid sequence to mature to a protein. A region of the DNA coding for the ORF of mRNA is also called ORF. 
[0091] The expression modulating fragment (hereinafter referred to as "EMF") is used herein to define a series of 
polynucleotide fragments which modulate the expression of the ORF or another sequence ligated operatably thereto. 
The expression "modulate the expression of a sequence ligated operatably" is used herein to refer to changes in the 
expression of a sequence due to the presence of the EMF. Examples of the EMF include a promoter, an operator, an 
enhancer, a silencer, a ribosome-binding sequence, a transcriptional termination sequence, and the like. In coryneform 
bacteria, an EMF is usually present in an intergenic segment (a fragment positioned between two genes; about 10 to 
200 nucleotides in length). Accordingly, an EMF is frequently present in an intergenic segment of 10 nucleotides or 
longer. It is also possible to determine or discover the presence of an EMF by using known EMF sequences as a target 
sequence or a target structural motif (or a target motif) using an appropriate software or comparator, such as FASTA 
(Proc. Natl. Acad. Sci. USA, 85: 2444-48 (1988)), BLAST (J. Mol. Biol., 215: 403-410 (1990)) or the like. Also, it can 
be identified and evaluated using a known EMF-capturing vector (for example, pKK232-8; manufactured by Amersham 
Pharmacia Biotech). 
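
The search for intergenic segments of 10 nucleotides or longer as candidate EMF locations can be illustrated with a minimal Python sketch. The coordinate convention (1-based, inclusive start and end on one strand) and the function name are assumptions made for this example only.

```python
def intergenic_segments(genes, min_length=10):
    """Return (start, end) of the segments between genes that are at least
    `min_length` nucleotides long.

    `genes` is a list of (start, end) pairs, 1-based and inclusive.
    Overlapping or nested genes are handled by tracking the furthest end seen.
    """
    segments = []
    furthest_end = None
    for start, end in sorted(genes):
        if furthest_end is not None and start - furthest_end - 1 >= min_length:
            segments.append((furthest_end + 1, start - 1))
        furthest_end = end if furthest_end is None else max(furthest_end, end)
    return segments


if __name__ == "__main__":
    print(intergenic_segments([(1, 100), (151, 300), (305, 400)]))
```

In this example the 50-nucleotide segment between the first two genes is reported, while the 4-nucleotide segment between the second and third genes is skipped.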

[0092] The term "target sequence" is used herein to refer to a nucleotide sequence composed of 6 or more nucleotides, 
an amino acid sequence composed of 2 or more amino acids, or a nucleotide sequence encoding this amino 
acid sequence composed of 2 or more amino acids. A longer target sequence appears at random in a data base at 
the lower possibility. The target sequence is preferably about 10 to 100 amino acid residues or about 30 to 300 nucleotide 
residues. 

[0093] The term "target structural motif" or "target motif" is used herein to refer to a sequence or a combination of 
sequences selected optionally and reasonably. Such a motif is selected on the basis of the three-dimensional structure 
formed by the folding of a polypeptide by means known to one of ordinary skill in the art. Various motifs are known. 
[0094] Examples of the target motif of a polypeptide include, but are not limited to, an enzyme activity site, a protein-protein 
interaction site, a signal sequence, and the like. Examples of the target motif of a nucleic acid include a promoter 
sequence, a transcriptional regulatory factor binding sequence, a hair pin structure, and the like. 
[0095] Examples of highly useful EMF include a high-expression promoter, an inducible-expression promoter, and 
the like. Such an EMF can be obtained by positionally determining the nucleotide sequence of a gene which is known 
or expected as achieving high expression (for example, ribosomal RNA gene: GenBank Accession No. M16175 or 
Z46753) or a gene showing a desired induction pattern (for example, isocitrate lyase gene induced by acetic acid: 
Japanese Published Unexamined Patent Application No. 56782/93) via the alignment with the full genome nucleotide 
sequence determined in the above item 1, and isolating the genome fragment in the upstream part (usually 200 to 500 
nucleotides from the translation initiation site). It is also possible to obtain a highly useful EMF by selecting an EMF 
showing a high expression efficiency or a desired induction pattern from among promoters captured by the EMF-capturing 
vector as described above. 

[0096] The ORF can be identified by extracting characteristics common to individual ORFs, constructing a general 
model based on these characteristics, and measuring the conformity of the subject sequence with the model. In the 
identification, a software, such as GeneMark (Nuc. Acids. Res., 22: 4756-67 (1994): manufactured by GenePro), 
GeneMark.hmm (manufactured by GenePro), GeneHacker (Protein, Nucleic Acid and Enzyme, 42: 3001-07 (1997)), 
Glimmer (Nuc. Acids. Res., 26: 544-548 (1998): manufactured by The Institute of Genomic Research), or the like, can 
be used. In using the software, the default (initial setting) parameters are usually used, though the parameters can be 
changed. 
[0097] In the above-described comparisons, a computer, such as UNIX, PC, Macintosh, or the like, can be used. 
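
As a point of reference for what an ORF candidate looks like in sequence data, the following minimal Python sketch scans the three forward reading frames for an ATG followed in frame by a stop codon. This naive scan is an illustration only and is not the model-based approach of GeneMark, GeneHacker, or Glimmer; the function name, the `min_codons` parameter, and the restriction to the forward strand are assumptions of this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end) pairs, 0-based and half-open, of ATG...stop
    stretches on the forward strand.

    `min_codons` counts the start and stop codons as part of the length.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens a candidate ORF in this frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next candidate after the stop
    return sorted(orfs)


if __name__ == "__main__":
    sequence = "CCATGAAATTTTAGGG"
    for begin, end in find_orfs(sequence):
        print(begin, end, sequence[begin:end])
```

Model-based gene finders improve on this by scoring codon usage and other statistical features rather than relying on start and stop codons alone.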
[0098] Examples of the ORF determined by the method of the present invention include ORFs having the nucleotide 
sequences represented by SEQ ID NOS:2 to 3501 present in the genome of Corynebacterium glutamicum as represented 
by SEQ ID NO:1. In these ORFs, polypeptides having the amino acid sequences represented by SEQ ID NOS: 
3502 to 7001 are encoded. 
[0099] The function of an ORF can be determined by comparing the identified amino acid sequence of the ORF with 
known homologous sequences using a homology searching software or comparator, such as BLAST, FASTA, Smith & 
Waterman (Meth. Enzym., 164: 765 (1988)) or the like on an amino acid data base, such as Swiss-Prot, PIR, GenBank-nr-aa, 
GenPept constituted by protein-encoding domains derived from GenBank data base, OWL or the like. 

[0100] Furthermore, by the homology searching, the identity and similarity with the amino acid sequences of known 
proteins can also be analyzed. 

[0101] With respect to the term "identity" used herein, where two polypeptides each having 10 amino acids are 
different in the positions of 3 amino acids, these polypeptides have an identity of 70% with each other. In case wherein 
one of the different 3 amino acids is analogue (for example, leucine and isoleucine), these polypeptides have a similarity 
of 80%. 
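
The identity and similarity figures in the preceding paragraph can be reproduced with a minimal Python sketch. The two 10-residue sequences are invented for illustration, and treating leucine/isoleucine as the only analogous pair is an assumption taken from the single example given in the text; real homology searching software uses full substitution matrices.

```python
# Hypothetical set of analogous residue pairs; only Leu/Ile is named in the text.
SIMILAR_PAIRS = {frozenset("LI")}


def identity_and_similarity(a, b, similar_pairs=SIMILAR_PAIRS):
    """Return (identity %, similarity %) for two equal-length, ungapped sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    identical = sum(x == y for x, y in zip(a, b))
    analogous = sum(
        x != y and frozenset((x, y)) in similar_pairs for x, y in zip(a, b)
    )
    total = len(a)
    return 100 * identical / total, 100 * (identical + analogous) / total


if __name__ == "__main__":
    # Three positions differ; one of them is a Leu/Ile exchange.
    print(identity_and_similarity("MKLAVGTSWE", "MKIAVGTDWQ"))
```

The example pair differs at three of ten positions, one of which is a leucine/isoleucine exchange, giving 70% identity and 80% similarity.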

[0102] As a specific example, Table 1 shows the registration numbers in known data bases of sequences which are 
judged as having the highest similarity with the nucleotide sequence of the ORF derived from Corynebacterium glutamicum 
ATCC 13032, genes of these sequences, functions of these genes, and identities thereof compared with known 
amino acid translation sequences. 

[0103] Thus, a great number of novel genes derived from coryneform bacteria can be identified by determining the 
full nucleotide sequence of the genome derived from coryneform bacterium by the means of the present invention. 
Moreover, the function of the proteins encoded by these genes can be determined. Since coryneform bacteria are 
industrially highly useful microorganisms, many of the identified genes are industrially useful. 
[0104] Moreover, the characteristics of respective microorganisms can be clarified by classifying the functions thus 
determined. As a result, valuable information in breeding is obtained. 

[0105] Furthermore, from the ORF information derived from coryneform bacteria, the ORF corresponding to the 
microorganism is prepared and obtained according to the general method as disclosed in Molecular Cloning, 2nd ed. 
or the like. Specifically, an oligonucleotide having a nucleotide sequence adjacent to the ORF is synthesized, and the 
ORF can be isolated and obtained using the oligonucleotide as a primer and a chromosome DNA derived from coryneform 
bacteria as a template according to the general PCR cloning technique. Thus obtained ORF sequences 
include polynucleotides comprising the nucleotide sequence represented by any one of SEQ ID NOS:2 to 3501. 
[0106] The ORF or primer can be prepared using a polynucleotide synthesizer based on the above sequence information. 

[0107] Examples of the polynucleotide of the present invention include a polynucleotide containing the nucleotide 
sequence of the ORF obtained in the above, and a polynucleotide which hybridizes with the polynucleotide under 
stringent conditions. 

[0108] The polynucleotide of the present invention can be a single-stranded DNA, a double-stranded DNA and a 
single-stranded RNA, though it is not limited thereto. 

[0109] The polynucleotide which hybridizes with the polynucleotide containing the nucleotide sequence of the ORF 
obtained in the above under stringent conditions includes a degenerated mutant of the ORF. A degenerated mutant is 
a polynucleotide fragment having a nucleotide sequence which is different from the sequence of the ORF of the present 
invention but which encodes the same amino acid sequence by degeneracy of the genetic code. 
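
The notion of a degenerated mutant, that is, a different nucleotide sequence encoding the same amino acid sequence, can be illustrated with a minimal Python sketch. The codon table below is the standard genetic code, used here as an assumption; the function names and the example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an upper-case DNA sequence codon by codon; a trailing
    partial codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_degenerate_mutant(orf, candidate):
    """True when the two sequences differ but encode the same amino acids."""
    return orf != candidate and translate(orf) == translate(candidate)


if __name__ == "__main__":
    # CTG/TTA both encode leucine and AAA/AAG both encode lysine.
    print(is_degenerate_mutant("ATGCTGAAATAA", "ATGTTAAAGTAA"))
```

The two example sequences differ at several nucleotide positions yet both translate to the same short peptide, so the second qualifies as a degenerated mutant of the first in the sense described above.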

[0110] Specific examples include a polynucleotide comprising the nucleotide sequence represented by any one of 
SEQ ID NOS:2 to 3431, and a polynucleotide which hybridizes with the polynucleotide under stringent conditions. 
[0111] A polynucleotide which hybridizes under stringent conditions is a polynucleotide obtained by colony hybridization, 
plaque hybridization, Southern blot hybridization or the like using, as a probe, the polynucleotide having the 
nucleotide sequence of the ORF identified in the above. Specific examples include a polynucleotide which can be 
identified by carrying out hybridization at 65°C in the presence of 0.7-1.0 M NaCl using a filter on which a polynucleotide 
prepared from colonies or plaques is immobilized, and then washing the filter with 0.1x to 2x SSC solution (the composition 
of 1x SSC contains 150 mM sodium chloride and 15 mM sodium citrate) at 65°C. 
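
The composition of the 0.1x to 2x SSC washing solutions follows from the stated 1x composition by simple scaling, as the following minimal Python sketch shows. The function name and the dictionary keys are hypothetical.

```python
def ssc_composition(strength):
    """Return the salt concentrations (mM) of an SSC solution of the given
    strength, scaled from 1x SSC = 150 mM sodium chloride, 15 mM sodium citrate."""
    return {
        "sodium_chloride_mM": 150 * strength,
        "sodium_citrate_mM": 15 * strength,
    }


if __name__ == "__main__":
    for strength in (0.1, 1, 2):
        print(strength, ssc_composition(strength))
```

A lower SSC strength means a lower salt concentration and therefore a more stringent wash.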

[0112] The hybridization can be carried out in accordance with known methods described in, for example, Molecular 
Cloning, 2nd ed., Current Protocols in Molecular Biology, DNA Cloning 1: Core Techniques, A Practical Approach, 
Second Edition, Oxford University (1995) or the like. Specific examples of the polynucleotide which can be hybridized 
include a DNA having a homology of 60% or more, preferably 80% or more, and particularly preferably 95% or more, 
with the nucleotide sequence represented by any one of SEQ ID NOS:2 to 3431 when calculated using default (initial 
setting) parameters of a homology searching software, such as BLAST, FASTA, Smith-Waterman or the like. 
[0113] Also, the polynucleotide of the present invention includes a polynucleotide encoding a polypeptide comprising 
the amino acid sequence represented by any one of SEQ ID NOS:3502 to 6931 and a polynucleotide which hybridizes 
with the polynucleotide under stringent conditions. 

[0114] Furthermore, the polynucleotide of the present invention includes a polynucleotide which is present in the 5' 
upstream or 3' downstream region of a polynucleotide comprising the nucleotide sequence of any one of SEQ ID NOS: 
2 to 3431 in a polynucleotide comprising the nucleotide sequence represented by SEQ ID NO:1, and has an activity 
of regulating an expression of a polypeptide encoded by the polynucleotide. Specific examples of the polynucleotide 
having an activity of regulating an expression of a polypeptide encoded by the polynucleotide include a polynucleotide 
encoding the above described EMF, such as a promoter, an operator, an enhancer, a silencer, a ribosome-binding 
sequence, a transcriptional termination sequence, and the like. 

[0115] The primer used for obtaining the ORF according to the above PCR cloning technique includes an oligonucleotide
comprising a sequence which is the same as a sequence of 10 to 200 continuous nucleotides in the nucleotide
sequence of the ORF and an adjacent region or an oligonucleotide comprising a sequence which is complementary
to the oligonucleotide. Specific examples include an oligonucleotide comprising a sequence which is the same as a
sequence of 10 to 200 continuous nucleotides of the nucleotide sequence represented by any one of SEQ ID NOS:1
to 3431, and an oligonucleotide comprising a sequence complementary to the oligonucleotide comprising a sequence
of at least 10 to 20 continuous nucleotides of any one of SEQ ID NOS:1 to 3431. When the primers are used as a sense
primer and an antisense primer, the above-described oligonucleotides in which melting temperature (Tm) and the
number of nucleotides are not significantly different from each other are preferred.
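The preference for a sense primer and an antisense primer of similar Tm and length can be sketched as follows. This is only an illustrative Python sketch: it estimates Tm with the Wallace rule (2 °C per A or T, 4 °C per G or C), which is a rough approximation suited to short oligonucleotides, and the tolerance values and function names are assumptions that do not appear in the present specification.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def wallace_tm(primer: str) -> float:
    """Rough melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2.0 * at + 4.0 * gc


def primers_compatible(sense: str, antisense: str,
                       max_tm_diff: float = 5.0,
                       max_len_diff: int = 3) -> bool:
    """True when Tm and length differ by no more than the assumed tolerances."""
    tm_ok = abs(wallace_tm(sense) - wallace_tm(antisense)) <= max_tm_diff
    len_ok = abs(len(sense) - len(antisense)) <= max_len_diff
    return tm_ok and len_ok
```

An antisense primer would typically be taken as the reverse complement of a stretch at the 3' side of the target region, after which the pair can be screened with primers_compatible.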

[0116] The oligonucleotide of the present invention includes an oligonucleotide comprising a sequence which is the
same as 10 to 200 continuous nucleotides of the nucleotide sequence represented by any one of SEQ ID NOS:1 to 
3431 or an oligonucleotide comprising a sequence complementary to the oligonucleotide. 

[0117] Analogues of these oligonucleotides (hereinafter also referred to as "analogous oligonucleotides") are
also provided by the present invention and are useful in the methods described herein.

[0118] Examples of the analogous oligonucleotides include analogous oligonucleotides in which a phosphodiester
bond in an oligonucleotide is converted to a phosphorothioate bond, analogous oligonucleotides in which a phosphodiester
bond in an oligonucleotide is converted to an N3'-P5' phosphoamidate bond, analogous oligonucleotides in which
ribose and a phosphodiester bond in an oligonucleotide are converted to a peptide nucleic acid bond, analogous
oligonucleotides in which uracil in an oligonucleotide is replaced with C-5 propynyluracil, analogous oligonucleotides in
which uracil in an oligonucleotide is replaced with C-5 thiazoluracil, analogous oligonucleotides in which cytosine in
an oligonucleotide is replaced with C-5 propynylcytosine, analogous oligonucleotides in which cytosine in an
oligonucleotide is replaced with phenoxazine-modified cytosine, analogous oligonucleotides in which ribose in an
oligonucleotide is replaced with 2'-O-propylribose, analogous oligonucleotides in which ribose in an oligonucleotide is replaced
with 2'-methoxyethoxyribose, and the like (Cell Engineering, 16: 1463 (1997)).

[0119] The above oligonucleotides and analogous oligonucleotides of the present invention can be used not only as
primers but also as probes for hybridization and as antisense nucleic acids described below.

[0120] Examples of a primer for the antisense nucleic acid techniques known in the art include an oligonucleotide
which hybridizes with the oligonucleotide of the present invention under stringent conditions and has an activity of
regulating expression of the polypeptide encoded by the polynucleotide, in addition to the above oligonucleotide.

3. Determination of isozymes



[0121] Many mutants of coryneform bacteria which are useful in the production of useful substances, such as amino 
acids, nucleic acids, vitamins, saccharides, organic acids, and the like, are obtained by the present invention. 

[0122] However, since the gene sequence data of the microorganism has been, to date, insufficient, useful mutants
have been obtained by mutagenic techniques using a mutagen, such as nitrosoguanidine (NTG) or the like.
[0123] Although genes can be mutated randomly by the mutagenic method using the above-described mutagen, all
genes encoding respective isozymes having similar properties relating to the metabolism of intermediates cannot be
mutated. In the mutagenic method using a mutagen, genes are mutated randomly. Accordingly, harmful mutations
worsening culture characteristics, such as delay in growth, accelerated foaming, and the like, might be imparted at a
great frequency, in a random manner.

[0124] However, if gene sequence information is available, such as is provided by the present invention, it is possible
to mutate all of the genes encoding target isozymes. In this case, harmful mutations may be avoided and the target
mutation can be incorporated.

[0125] Namely, an accurate number and sequence information of the target isozymes in coryneform bacteria can be
obtained based on the ORF data obtained in the above item 2. By using the sequence information, all of the target
isozyme genes can be mutated into genes having the desired properties by, for example, the site-specific mutagenesis
method described in Molecular Cloning, 2nd ed. to obtain useful mutants having elevated productivity of useful
substances.



4. Clarification or determination of biosynthesis pathway and signal transmission pathway 



[0126] Attempts have been made to elucidate biosynthesis pathways and signal transmission pathways in a number
of organisms, and many findings have been reported. However, there are many unknown aspects of coryneform
bacteria since a number of genes have not been identified so far.

[0127] These unknown points can be clarified by the following method. 

[0128] The functional information of ORFs derived from coryneform bacteria as identified by the method of above item
2 is arranged. The term "arranged" means that each ORF is classified based on the biosynthesis pathway of a substance
or the signal transmission pathway to which the ORF belongs using known information according to the functional
information. Next, the arranged ORF sequence information is compared with enzymes on the biosynthesis pathways
or signal transmission pathways of other known organisms. The resulting information is combined with known data on
coryneform bacteria. Thus, the biosynthesis pathways and signal transmission pathways in coryneform bacteria, which
have been unknown so far, can be determined.
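The "arranging" step can be pictured as grouping ORF identifiers under pathway labels. The following Python sketch is a minimal illustration under stated assumptions: the function name, the input format (pairs of ORF identifier and pathway label) and any labels supplied to it are hypothetical, and the sketch says nothing about how the functional information itself is obtained.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def arrange_orfs(annotations: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group ORF identifiers by the pathway label assigned to each of them.

    `annotations` yields (orf_id, pathway_label) pairs; both values are
    placeholders supplied by the caller, not data from the specification.
    """
    arranged: Dict[str, List[str]] = defaultdict(list)
    for orf_id, pathway in annotations:
        arranged[pathway].append(orf_id)
    # Sort identifiers so that the grouping is reproducible.
    return {pathway: sorted(ids) for pathway, ids in arranged.items()}
```

Once grouped in this way, each pathway's ORFs can be compared with the corresponding enzymes of other organisms, as the paragraph above describes.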

[0129] Once these pathways, which have hitherto been unknown or unclear, are clarified, a useful mutant
for producing a target useful substance can be efficiently obtained.

[0130] When the thus clarified pathway is judged as important in the synthesis of a useful product, a useful mutant
can be obtained by selecting a mutant wherein this pathway has been strengthened. Also, when the thus clarified
pathway is judged as not important in the biosynthesis of the target useful product, a useful mutant can be obtained
by selecting a mutant wherein the utilization frequency of this pathway is lowered.



5. Clarification or determination of useful mutation point

[0131] Many useful mutants of coryneform bacteria which are suitable for the production of useful substances, such
as amino acids, nucleic acids, vitamins, saccharides, organic acids, and the like, have been obtained. However, it is
largely unknown which mutation point in a gene is responsible for the improved productivity.

[0132] However, mutation points contained in production strains can be identified by comparing desired sequences
of the genome DNA of the production strains obtained from coryneform bacteria by the mutagenic technique with the
nucleotide sequences of the corresponding genome DNA and ORF derived from coryneform bacteria determined by
the methods of the above items 1 and 2 and analyzing them.

[0133] Moreover, effective mutation points contributing to the production can be easily specified from among these 
mutation points on the basis of known information relating to the metabolic pathways, the metabolic regulatory mech- 
anisms, the structure activity correlation of enzymes, and the like. 
[0134] When an effective mutation can hardly be specified based on known data, the mutation points thus identified
can be introduced into a wild strain of coryneform bacteria or a production strain free of the mutation. Then, it is examined
whether or not any positive effect can be achieved on the production.

[0135] For example, by comparing the nucleotide sequence of homoserine dehydrogenase gene hom of a lysine-producing
B-6 strain of Corynebacterium glutamicum (Appl. Microbiol. Biotechnol., 32: 269-273 (1989)) with the
nucleotide sequence corresponding to the genome of Corynebacterium glutamicum ATCC 13032 according to the present
invention, a mutation of amino acid replacement in which valine at the 59-position is replaced with alanine (Val59Ala)
was identified. A strain obtained by introducing this mutation into the ATCC 13032 strain by the gene replacement
method can produce lysine, which indicates that this mutation is an effective mutation contributing to the production
of lysine.
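The comparison that yields a mutation point such as Val59Ala can be sketched in Python as a codon-by-codon comparison of two coding sequences. This is only an illustrative sketch: it assumes in-frame sequences of equal length that differ by base substitutions only (no insertions or deletions), it uses the standard genetic code, and the function name and output format are hypothetical rather than taken from the present specification.

```python
# Standard genetic code, built with bases in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "Ter",
}


def find_mutation_points(wild_cds: str, mutant_cds: str):
    """List (label, kind) for every codon that differs between the two sequences."""
    wild, mutant = wild_cds.upper(), mutant_cds.upper()
    if len(wild) != len(mutant) or len(wild) % 3 != 0:
        raise ValueError("expected in-frame coding sequences of equal length")
    points = []
    for pos in range(0, len(wild), 3):
        w_codon, m_codon = wild[pos:pos + 3], mutant[pos:pos + 3]
        if w_codon == m_codon:
            continue
        w_aa, m_aa = CODON_TABLE[w_codon], CODON_TABLE[m_codon]
        kind = "silent" if w_aa == m_aa else "replacement"
        # Residue numbering is 1-based, as in notation such as Val59Ala.
        label = f"{THREE_LETTER[w_aa]}{pos // 3 + 1}{THREE_LETTER[m_aa]}"
        points.append((label, kind))
    return points
```

Applied to a wild-type ORF and the corresponding ORF of a production strain, the "replacement" entries are the candidates whose effect on production would then be examined.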

[0136] Similarly, by comparing the nucleotide sequence of pyruvate carboxylase gene pyc of the B-6 strain with the
nucleotide sequence corresponding to the ATCC 13032 genome, a mutation of amino acid replacement in which proline
at the 458-position was replaced with serine (Pro458Ser) was identified. A strain obtained by introducing this mutation
into a lysine-producing strain of No. 58 (FERM BP-7134) of Corynebacterium glutamicum free of this mutation shows
an improved lysine productivity in comparison with the No. 58 strain, which indicates that this mutation is an effective
mutation contributing to the production of lysine.

[0137] In addition, a mutation Ala213Thr in glucose-6-phosphate dehydrogenase was specified as an effective
mutation relating to the production of lysine by detecting glucose-6-phosphate dehydrogenase gene zwf of the B-6 strain.
[0138] Furthermore, the lysine productivity of Corynebacterium glutamicum was improved by replacing the base at
the 932-position of aspartokinase gene lysC of the Corynebacterium glutamicum ATCC 13032 genome with cytosine
to thereby replace threonine at the 311-position by isoleucine, which indicates that this mutation is an effective mutation
contributing to the production of lysine.

[0139] Also, as another method to examine whether or not the identified mutation point is an effective mutation, there
is a method in which the mutation possessed by the lysine-producing strain is returned to the sequence of a wild type
strain by the gene replacement method and it is examined whether or not this has a negative influence on the lysine
productivity. For example, when the amino acid replacement mutation Val59Ala possessed by hom of the lysine-producing B-6 strain
was returned to a wild type amino acid sequence, the lysine productivity was lowered in comparison with the B-6 strain.
Thus, it was found that this mutation is an effective mutation contributing to the production of lysine.
[0140] Effective mutation points can be more efficiently and comprehensively extracted by combining, if needed, the
DNA array analysis or proteome analysis described below.


6. Method of breeding industrially advantageous production strain 



[0141] It has been a general practice to construct production strains, which are used industrially in the fermentation 
production of the target useful substances, such as amino acids, nucleic acids, vitamins, saccharides, organic acids, 
and the like, by repeating mutagenesis and breeding based on random mutagenesis using mutagens, such as NTG 
or the like, and screening. 

[0142] In recent years, many examples of improved production strains have been made through the use of recombinant
DNA techniques. In breeding, however, most of the parent production strains to be improved are mutants obtained
by a conventional mutagenic procedure (W. Leuchtenberger, Amino Acids - Technical Production and Use. In:
Roehr (ed) Biotechnology, second edition, vol. 6, products of primary metabolism. VCH Verlagsgesellschaft mbH,
Weinheim, P465 (1996)).

[0143] Although mutagenesis methods have largely contributed to the progress of the fermentation industry, they
suffer from a serious problem of multiple, random introduction of mutations into every part of the chromosome. Since
many mutations are accumulated in a single chromosome each time a strain is improved, a production strain obtained
by random mutation and selection is generally inferior in properties (for example, showing poor growth, delayed
consumption of saccharides, and poor resistance to stresses such as temperature and oxygen) to a wild type strain,
which brings about troubles such as failing to establish a sufficiently elevated productivity, being frequently contaminated
with miscellaneous bacteria, requiring troublesome procedures in culture maintenance, and the like, and, in
turn, elevating the production cost in practice. In addition, the improvement in the productivity is based on random
mutations and thus the mechanism thereof is unclear. Therefore, it is very difficult to plan a rational breeding strategy
for the subsequent improvement in the productivity.

[0144] According to the present invention, effective mutation points contributing to the production can be efficiently
specified from among many mutation points accumulated in the chromosome of a production strain which has been
bred from coryneform bacteria and, therefore, a novel breeding method of assembling these effective mutations in the
coryneform bacteria can be established. Thus, a useful production strain can be reconstructed. It is also possible to
construct a useful production strain from a wild type strain.
[0145] Specifically, a useful mutant can be constructed in the following manner. 
[0146] One of the mutation points is incorporated into a wild type strain of coryneform bacteria. Then, it is examined
whether or not a positive effect is established on the production. When a positive effect is obtained, the mutation point
is saved. When no effect is obtained, the mutation point is removed. Subsequently, only a strain having the effective
mutation point is used as the parent strain, and the same procedure is repeated. In general, the effectiveness of a
mutation positioned upstream cannot be clearly evaluated in some cases when there is a rate-determining point in the
downstream of a biosynthesis pathway. It is therefore preferred to successively evaluate mutation points upward from
downstream.
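The save-or-remove procedure described above amounts to a greedy, one-mutation-at-a-time loop. The Python sketch below illustrates that control flow only, under stated assumptions: the function name is hypothetical, `measure_productivity` stands in for an actual culture experiment and is supplied by the caller, and the candidate mutation points are assumed to be listed already in downstream-to-upstream order.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple


def reconstruct_strain(
    candidates_downstream_first: Sequence[str],
    measure_productivity: Callable[[FrozenSet[str]], float],
) -> Tuple[List[str], float]:
    """Keep each mutation point only if it improves on the current parent strain."""
    strain: FrozenSet[str] = frozenset()        # wild type: no mutation points yet
    best = measure_productivity(strain)
    kept: List[str] = []
    for mutation in candidates_downstream_first:
        trial = strain | {mutation}             # incorporate one mutation point
        score = measure_productivity(trial)
        if score > best:                        # positive effect: save it
            strain, best = trial, score
            kept.append(mutation)
        # no positive effect: the mutation point is simply not carried forward
    return kept, best
```

Because each accepted strain becomes the parent for the next trial, the loop mirrors the repetition described in the text; it does not attempt to model interactions between mutations beyond what the supplied measurement reflects.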

[0147] By reconstituting effective mutations by the method as described above in a wild type strain or a strain which
has a high growth speed or the same ability to consume saccharides as the wild type strain, it is possible to construct
an industrially advantageous strain which is free of the troubles in the previous methods as described above and to conduct
fermentation production using such strains within a short time or at a higher temperature.

[0148] For example, a lysine-producing mutant B-6 (Appl. Microbiol. Biotechnol., 32: 262-273 (1989)), which is obtained
by multiple rounds of random mutagenesis from a wild type strain Corynebacterium glutamicum ATCC 13032,
enables lysine fermentation to be performed at a temperature between 30 and 34°C but shows lowered growth and
lysine productivity at a temperature exceeding 34°C. Therefore, the fermentation temperature should be maintained
at 34°C or lower. In contrast thereto, the production strain described in the above item 5, which is obtained by
reconstituting effective mutations relating to lysine production, can achieve a productivity at 40 to 42°C equal or superior to
the result obtained by culturing at 30 to 34°C. Therefore, this strain is industrially advantageous since it can save the
load of cooling during the fermentation.

[0149] When culture should be carried out at a high temperature exceeding 43°C, a production strain capable of
conducting fermentation production at a high temperature exceeding 43°C can be obtained by reconstituting useful
mutations in a microorganism belonging to the genus Corynebacterium which can grow at high temperature exceeding
43°C. Examples of the microorganism capable of growing at a high temperature exceeding 43°C include Corynebacterium
thermoaminogenes, such as Corynebacterium thermoaminogenes FERM 9244, FERM 9245, FERM 9246 and
FERM 9247.

[0150] A strain having a further improved productivity of the target product can be obtained using the thus reconstructed
strain as the parent strain and further breeding it using the conventional mutagenesis method, the gene amplification
method, the gene replacement method using the recombinant DNA technique, the transduction method or
the cell fusion method. Accordingly, the microorganism of the present invention includes, but is not limited to, a mutant,
a cell fusion strain, a transformant, a transductant or a recombinant strain constructed by using recombinant DNA
techniques, so long as it is a producing strain obtained via the step of accumulating at least two effective mutations in
a coryneform bacterium in the course of breeding.

[0151] When a mutation point judged as being harmful to the growth or production is specified, on the other hand,
it is examined whether or not the producing strain used at present contains the mutation point. When it has the mutation,
it can be returned to the wild type gene and thus a further useful production strain can be bred.
[0152] The breeding method as described above is applicable to microorganisms, other than coryneform bacteria,
which have industrially advantageous properties (for example, microorganisms capable of quickly utilizing less expensive
carbon sources, microorganisms capable of growing at higher temperatures).

7. Production and utilization of polynucleotide array 


(1) Production of polynucleotide array 

[0153] A polynucleotide array can be produced using the polynucleotide or oligonucleotide of the present invention
obtained in the above items 1 and 2.
[0154] Examples include a polynucleotide array comprising a solid support to which at least one of a polynucleotide
comprising the nucleotide sequence represented by SEQ ID NOS:2 to 3501, a polynucleotide which hybridizes with
the polynucleotide under stringent conditions, and a polynucleotide comprising 10 to 200 continuous nucleotides in
the nucleotide sequence of the polynucleotide is adhered; and a polynucleotide array comprising a solid support to
which at least one of a polynucleotide encoding a polypeptide comprising the amino acid sequence represented by
any one of SEQ ID NOS:3502 to 7001, a polynucleotide which hybridizes with the polynucleotide under stringent
conditions, and a polynucleotide comprising 10 to 200 continuous bases in the nucleotide sequences of the
polynucleotides is adhered.

[0155] Polynucleotide arrays of the present invention include substrates known in the art, such as a DNA chip, a 
DNA microarray and a DNA macroarray, and the like, and comprise a solid support and plural polynucleotides or
fragments thereof which are adhered to the surface of the solid support. 
[0156] Examples of the solid support include a glass plate, a nylon membrane, and the like. 

[0157] The polynucleotides or fragments thereof can be adhered to the surface of the solid support using the general
technique for preparing arrays, namely, a method in which they are adhered to a chemically surface-treated solid
support, for example, one to which a polycation such as polylysine or the like
has been adhered (Nat. Genet., 21: 15-19 (1999)). The chemically surface-treated supports are commercially available,
and the commercially available product can be used as the solid support of the polynucleotide array according
to the present invention.

[0158] As the polynucleotides or oligonucleotides adhered to the solid support, the polynucleotides and oligonucle- 
otides of the present invention obtained in the above items 1 and 2 can be used. 

[0159] The analysis described below can be efficiently performed by adhering the polynucleotides or oligonucleotides
to the solid support at a high density, though a high fixation density is not always necessary. 

[0160] Apparatus for achieving a high fixation density, such as an arrayer robot or the like, is commercially available
from Takara Shuzo (GMS417 Arrayer), and the commercially available product can be used.

[0161] Also, the oligonucleotides of the present invention can be synthesized directly on the solid support by the 
photolithography method or the like (Nat. Genet., 21: 20-24 (1999)). In this method, a linker having a protective group
which can be removed by light irradiation is first adhered to a solid support, such as a slide glass or the like. Then, it
is irradiated with light through a mask (a photolithograph mask) permeating light exclusively at a definite part of the 
adhesion part. Next, an oligonucleotide having a protective group which can be removed by light irradiation is added 
to the part. Thus, a ligation reaction with the nucleotide arises exclusively at the irradiated part. By repeating this 
procedure, oligonucleotides, each having a desired sequence, different from each other can be synthesized in respec- 
tive parts. Usually, the oligonucleotides to be synthesized have a length of 10 to 30 nucleotides. 

(2) Use of polynucleotide array

[0162] The following procedures (a) and (b) can be carried out using the polynucleotide array prepared in the above (1).

(a) Identification of mutation point of coryneform bacterium mutant and analysis of expression amount and expression
profile of gene encoded by genome

[0163] By subjecting a gene derived from a mutant of coryneform bacteria or an examined gene to the following
steps (i) to (iv), the mutation point of the gene can be identified or the expression amount and expression profile of the
gene can be analyzed:

(i) producing a polynucleotide array by the method of the above (1); 

(ii) incubating polynucleotides immobilized on the polynucleotide array together with the labeled gene derived from
a mutant of the coryneform bacterium using the polynucleotide array produced in the above (i) under hybridization
conditions;

(iii) detecting the hybridization; and

(iv) analyzing the hybridization data.



[0164] The gene derived from a mutant of coryneform bacteria or the examined gene includes a gene relating to
biosynthesis of at least one selected from amino acids, nucleic acids, vitamins, saccharides, organic acids, and
analogues thereof.

[0165] The method will be described in detail. 

[0166] A single nucleotide polymorphism (SNP) in a human region of 2,300 kb has been identified using polynucleotide
arrays (Science, 280: 1077-82 (1998)). In accordance with the method of identifying SNP and methods described
in Science, 278: 680-686 (1997); Proc. Natl. Acad. Sci. USA, 96: 12833-38 (1999); Science, 284: 1520-23 (1999), and
the like, using the polynucleotide array produced in the above (1) and a nucleic acid molecule (DNA, RNA) derived from
coryneform bacteria in the method of hybridization, a mutation point of a useful mutant, which is useful in producing
an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, or the like, can be identified and the gene
expression amount and the expression profile thereof can be analyzed.

[0167] The nucleic acid molecule (DNA, RNA) derived from the coryneform bacteria can be obtained according to
the general method described in Molecular Cloning, 2nd ed. or the like. mRNA derived from Corynebacterium glutamicum
can also be obtained by the method of Bormann et al. (Molecular Microbiology, 6: 317-326 (1992)) or the like.
[0168] Although ribosomal RNA (rRNA) is usually obtained in large excess in addition to the target mRNA, the analysis
is not seriously disturbed thereby.

[0169] The resulting nucleic acid molecule derived from coryneform bacteria is labeled. Labeling can be carried out
according to a method using a fluorescent dye, a method using a radioisotope or the like. 

[0170] Specific examples include a labeling method in which psoralen-biotin is crosslinked with RNA extracted from
a microorganism and, after hybridization reaction, a fluorescent dye having streptavidin bound thereto is bound to
the biotin moiety (Nat. Biotechnol., 16: 46-48 (1998)); a labeling method in which a reverse transcription reaction is
carried out using RNA extracted from a microorganism as a template and random primers as primers and dUTP having
a fluorescent dye (for example, Cy3, Cy5) (manufactured by Amersham Pharmacia Biotech) is incorporated into cDNA
(Proc. Natl. Acad. Sci. USA, 96: 12833-38 (1999)); and the like.
[0171] The labeling specificity can be improved by replacing the random primers by sequences complementary to
the 3'-end of ORF (J. Bacteriol., 181: 6425-40 (1999)).

[0172] In the hybridization method, the hybridization and subsequent washing can be carried out by the general
method (Nat. Biotechnol., 14: 1675-80 (1996), or the like).

[0173] Subsequently, the hybridization intensity is measured depending on the hybridization amount of the nucleic
acid molecule used in the labeling. Thus, the mutation point can be identified and the expression amount of the gene
can be calculated.
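One common way to turn measured hybridization intensities into a relative expression amount is a background-corrected log2 ratio between a sample signal and a reference signal for the same spot. The Python sketch below illustrates that calculation only; the two-signal comparison, the background subtraction, the floor value and the function name are all assumptions made for illustration and are not prescribed by the present specification.

```python
import math


def log2_expression_ratio(sample_intensity: float,
                          reference_intensity: float,
                          background: float = 0.0,
                          floor: float = 1.0) -> float:
    """Background-corrected log2(sample / reference) for a single array spot.

    Intensities at or below background are clamped to `floor` so that the
    ratio stays defined; the clamp value is an arbitrary assumption.
    """
    sample = max(sample_intensity - background, floor)
    reference = max(reference_intensity - background, floor)
    return math.log2(sample / reference)
```

A value of 0 then indicates no difference between the two signals, while positive and negative values indicate higher and lower expression in the sample, respectively.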

[0174] The hybridization intensity can be measured by visualizing the fluorescent signal, radioactivity, luminescence
dose, and the like, using a laser confocal microscope, a CCD camera, a radiation imaging device (for example, STORM
manufactured by Amersham Pharmacia Biotech), and the like, and then quantifying the thus visualized data.
[0175] A polynucleotide array on a solid support can also be analyzed and quantified using a commercially available
apparatus, such as GMS418 Array Scanner (manufactured by Takara Shuzo) or the like.

[0176] The gene expression amount can be analyzed using commercially available software (for example, ImaGene
manufactured by Takara Shuzo; Array Gauge manufactured by Fuji Photo Film; ImageQuant manufactured by Amersham
Pharmacia Biotech, or the like).
[0177] A fluctuation in the expression amount of a specific gene can be monitored using a nucleic acid molecule
obtained in the time course of culture as the nucleic acid molecule derived from coryneform bacteria. The culture
conditions can be optimized by analyzing the fluctuation.

[0178] The expression profile of the microorganism at the total gene level (namely, which genes among a great
number of genes encoded by the genome have been expressed and the expression ratio thereof) can be determined
using a nucleic acid molecule having the sequences of many genes determined from the full genome sequence of the
microorganism. Thus, the expression amount of the genes determined by the full genome sequence can be analyzed
and, in turn, the biological conditions of the microorganism can be recognized as the expression pattern at the full
gene level.

(b) Confirmation of the presence of gene homologous to examined gene in coryneform bacteria

[0179] Whether or not a gene homologous to the examined gene, which is present in an organism other than co- 
ryneform bacteria, is present in coryneform bacteria can be detected using the polynucleotide array prepared in the 
above (1). 

[0180] This detection can be carried out by a method in which an examined gene which is present in an organism
other than coryneform bacteria is used instead of the nucleic acid molecule derived from coryneform bacteria used in
the above identification/analysis method or (1).

8. Recording medium storing full genome nucleotide sequence and ORF data and being readable by a computer and 
methods for using the same

[0181] The term "recording medium or storage device which is readable by a computer" means a recording medium
or storage medium which can be directly read out and accessed with a computer. Examples include magnetic recording
media, such as a floppy disk, a hard disk, a magnetic tape, and the like; optical recording media, such as CD-ROM,
CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, and the like; electric recording media, such as RAM, ROM, and the
like; and hybrids in these categories (for example, magnetic/optical recording media, such as MO and the like).
[0182] Instruments for recording or inputting in or on the recording medium or instruments or devices for reading out
the information in the recording medium can be appropriately selected, depending on the type of the recording medium
and the access device utilized. Also, various data processing programs, software, comparators and formats are used
for recording and utilizing the polynucleotide sequence information or the like, of the present invention in the recording
medium. The information can be expressed in the form of a binary file, a text file or an ASCII file formatted with
commercially available software, for example. Moreover, software for accessing the sequence information is available and
known to one of ordinary skill in the art.

[0183] Examples of the information to be recorded in the above-described medium include the full genome nucleotide
sequence information of coryneform bacteria as obtained in the above item 2, the nucleotide sequence information of 
ORF, the amino acid sequence information encoded by the ORF, and the functional information of polynucleotides 
coding for the amino acid sequences. 

[0184] The recording medium or storage device which is readable by a computer according to the present invention
refers to a medium in which the information of the present invention has been recorded. Examples include recording
media or storage devices which are readable by a computer storing the nucleotide sequence information represented
by SEQ ID NOS:1 to 3501, the amino acid sequence information represented by SEQ ID NOS:3502 to 7001, the
functional information of the nucleotide sequences represented by SEQ ID NOS:1 to 3501, the functional information
of the amino acid sequences represented by SEQ ID NOS:3502 to 7001, and the information listed in Table 1 below
and the like.

9. System based on a computer using the recording medium of the present invention which is readable by a computer 

[0185] The term "system based on a computer" as used herein refers to a system composed of hardware device(s),
software device(s), and data recording device(s) which are used for analyzing the data recorded in the recording
medium of the present invention which is readable by a computer.

[0186] The hardware device(s) are, for example, composed of an input unit, a data recording unit, a central processing
unit and an output unit collectively or individually. 

[0187] By the software device(s), the data recorded in the recording medium of the present invention are searched
or analyzed using the recorded data and the hardware device(s) as described herein. Specifically, the software device(s)
contain at least one program which acts on or with the system in order to screen, analyze or compare biologically
meaningful structures or information from the nucleotide sequences, amino acid sequences and the like recorded in
the recording medium according to the present invention.

[0188] Examples of the software device(s) for identifying ORF and EMF domains include GeneMark (Nuc. Acids.
Res., 22: 4756-67 (1994)), GeneHacker (Protein, Nucleic Acid and Enzyme, 42: 3001-07 (1997)), Glimmer (The Institute
for Genomic Research; Nuc. Acids. Res., 26: 544-548 (1998)) and the like. In the process of using such a software
device, the default (initial setting) parameters are usually used, although the parameters can be changed, if necessary,
in a manner known to one of ordinary skill in the art.

[0189] Examples of the software device(s) for identifying a genome domain or a polypeptide domain analogous to
the target sequence or the target structural motif (homology searching) include FASTA, BLAST, Smith-Waterman,
GenetyxMac (manufactured by Software Development), GCG Package (manufactured by Genetic Computer Group),
GenCore (manufactured by Compugen), and the like. In the process of using such a software device, the default (initial
setting) parameters are usually used, although the parameters can be changed, if necessary, in a manner known to
one of ordinary skill in the art.

[0190] Such a recording medium storing the full genome sequence data is useful in preparing a polynucleotide array 
by which the expression amount of a gene encoded by the genome DNA of coryneform bacteria and the expression 
profile at the total gene level of the microorganism, namely, which genes among many genes encoded by the genome 
have been expressed and the expression ratio thereof, can be determined. 

[0191] The data recording device(s) provided by the present invention are, for example, memory device(s) for
recording the data recorded in the recording medium of the present invention and target sequence or target structural
motif data, or the like, and a memory accessing device(s) for accessing the same.

[0192] Namely, the system based on a computer according to the present invention comprises the following: 

(i) a user input device that inputs the information stored in the recording medium of the present invention, and
target sequence or target structure motif information;

(ii) a data storage device for at least temporarily storing the input information;

(iii) a comparator that compares the information stored in the recording medium of the present invention with the
target sequence or target structure motif information, recorded by the data storing device of (ii), for screening and
analyzing nucleotide sequence information which is coincident with or analogous to the target sequence or target
structure motif information; and

(iv) an output device that shows a screening or analyzing result obtained by the comparator.

[0193] This system is usable in the methods in items 2 to 5 as described above for searching and analyzing the ORF
and EMF domains, target sequence, target structural motif, etc. of a coryneform bacterium, searching homologs,
searching and analyzing isozymes, determining the biosynthesis pathway and the signal transmission pathway, and
identifying spots which have been found in the proteome analysis. The term "homologs" as used herein includes both
orthologs and paralogs.
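
The four components (i) to (iv) of the system can be pictured with a minimal Python sketch. Everything here is hypothetical and chosen only for illustration: the class name SequenceSystem, its method names, and the use of a simple mismatch count as the measure of "coincident with or analogous to" are assumptions, not the comparator of the invention, which in practice would be a program such as FASTA or BLAST.

```python
from dataclasses import dataclass, field


@dataclass
class SequenceSystem:
    """Toy model of components (i)-(iv); all names are hypothetical."""
    store: dict = field(default_factory=dict)      # (ii) data storage device

    def input_record(self, name, sequence):        # (i) user input device
        self.store[name] = sequence.upper()

    def compare(self, target, max_mismatches=0):   # (iii) comparator
        """Find windows equal to the target or differing at no more than max_mismatches positions."""
        target = target.upper()
        hits = []
        for name, seq in self.store.items():
            for pos in range(len(seq) - len(target) + 1):
                window = seq[pos:pos + len(target)]
                mismatches = sum(a != b for a, b in zip(window, target))
                if mismatches <= max_mismatches:
                    hits.append((name, pos, mismatches))
        return hits

    def report(self, hits):                        # (iv) output device
        return [f"{name}\t{pos}\t{mm}" for name, pos, mm in hits]
```

With max_mismatches=0 the comparator reports only coincident sequences; raising it admits analogous ones, which is the distinction drawn in item (iii).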

10. Production of polypeptide using ORF derived from coryneform bacteria 

[0194] The polypeptide of the present invention can be produced using a polynucleotide comprising the ORF obtained
in the above item 2. Specifically, the polypeptide of the present invention can be produced by expressing the
polynucleotide of the present invention or a fragment thereof in a host cell, using the method described in Molecular
Cloning, 2nd ed., Current Protocols in Molecular Biology, and the like, for example, according to the following method.

[0195] A DNA fragment having a suitable length containing a part encoding the polypeptide is prepared from the full
length ORF sequence, if necessary.
[0196] Also, if necessary, DNA is prepared in which nucleotides in the nucleotide sequence of the part encoding the
polypeptide of the present invention are replaced so as to give codons suitable for expression in the host cell. The
DNA is useful for efficiently producing the polypeptide of the present invention.
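
The codon replacement described in this paragraph can be sketched in Python as a synonymous codon swap. The table PREFERRED_CODON below is a made-up example for an imaginary host, not measured codon usage for any organism, and the function name replace_codons is likewise hypothetical; the sketch shows only that each replaced codon is exchanged for another codon encoding the same amino acid, so the encoded polypeptide is unchanged.

```python
# Hypothetical preferred-codon table for an imaginary host; real values
# would come from that host's measured codon usage.
PREFERRED_CODON = {
    "AGA": "CGT", "AGG": "CGT",   # arginine
    "ATA": "ATC",                 # isoleucine
    "CTA": "CTG",                 # leucine
}


def replace_codons(coding_dna, preferred=PREFERRED_CODON):
    """Swap listed codons for synonymous ones, leaving all other codons untouched."""
    dna = coding_dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    return "".join(preferred.get(codon, codon) for codon in codons)
```

For example, replace_codons("ATGAGAATACTATAA") rewrites the three listed codons while keeping the start and stop codons as they are.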

[0197] A recombinant vector is prepared by inserting the DNA fragment downstream of a promoter in a
suitable expression vector.

[0198] The recombinant vector is introduced into a host cell suitable for the expression vector.

[0199] Any of bacteria, yeasts, animal cells, insect cells, plant cells, and the like can be used as the host cell so long
as it can express the gene of interest.

[0200] Examples of the expression vector include those which can replicate autonomously in the above-described
host cell or can be integrated into a chromosome and have a promoter at such a position that the DNA encoding the
polypeptide of the present invention can be transcribed.

[0201] When a prokaryotic cell, such as a bacterium or the like, is used as the host cell, it is preferred that the
recombinant vector containing the DNA encoding the polypeptide of the present invention can replicate autonomously
in the bacterium and is constituted by at least a promoter, a ribosome binding sequence, the DNA of the present
invention and a transcription termination sequence. A gene controlling the promoter can also be contained therewith
in operable combination.

[0202] Examples of the expression vectors include a vector plasmid which is replicable in Corynebacterium
glutamicum, such as pCG1 (Japanese Published Unexamined Patent Application No. 134500/82), pCG2 (Japanese Published
Unexamined Patent Application No. 35197/83), pCG4 (Japanese Published Unexamined Patent Application No.
183799/82), pCG11 (Japanese Published Unexamined Patent Application No. 134500/82), pCG116, pCE54 and
pCB101 (Japanese Published Unexamined Patent Application No. 105999/83), pCE51, pCE52 and pCE53 (Mol. Gen.
Genet., 196: 175-178 (1984)), and the like; a vector plasmid which is replicable in Escherichia coli, such as pET3 and
pET11 (manufactured by Stratagene), pBAD, pThioHis and pTrcHis (manufactured by Invitrogen), pKK223-3 and
pGEX2T (manufactured by Amersham Pharmacia Biotech), and the like; and pBTrp2, pBTac1 and pBTac2
(manufactured by Boehringer Mannheim Co.), pSE280 (manufactured by Invitrogen), pGEMEX-1 (manufactured by Promega),
pQE-8 (manufactured by QIAGEN), pKYP10 (Japanese Published Unexamined Patent Application No. 110600/83),
pKYP200 (Agric. Biol. Chem., 48: 669 (1984)), pLSA1 (Agric. Biol. Chem., 53: 277 (1989)), pGEL1 (Proc. Natl. Acad.
Sci. USA, 82: 4306 (1985)), pBluescript II SK(-) (manufactured by Stratagene), pTrs30 (prepared from Escherichia coli
JM109/pTrS30 (FERM BP-5407)), pTrs32 (prepared from Escherichia coli JM109/pTrS32 (FERM BP-5408)), pGHA2
(prepared from Escherichia coli IGHA2 (FERM B-400), Japanese Published Unexamined Patent Application No.
221091/85), pGKA2 (prepared from Escherichia coli IGKA2 (FERM BP-6798), Japanese Published Unexamined Patent
Application No. 221091/85), pTerm2 (U.S. Patents 4,686,191, 4,939,094 and 5,160,735), pSupex, pUB110, pTP5,
pC194 and pEG400 (J. Bacteriol., 172: 2392 (1990)), pGEX (manufactured by Pharmacia), pET system (manufactured
by Novagen), and the like.

[0203] Any promoter can be used so long as it can function in the host cell. Examples include promoters derived
from Escherichia coli, phage and the like, such as trp promoter (Ptrp), lac promoter, PL promoter, PR promoter, T7
promoter and the like. Also, artificially designed and modified promoters, such as a promoter in which two Ptrp are
linked in series (Ptrp×2), tac promoter, lacT7 promoter, letI promoter and the like, can be used.

[0204] It is preferred to use a plasmid in which the space between the Shine-Dalgarno sequence, which is the ribosome
binding sequence, and the initiation codon is adjusted to an appropriate distance (for example, 6 to 18 nucleotides).
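
A short Python sketch can make the spacing criterion concrete. It rests on several assumptions that the text does not state: the Shine-Dalgarno sequence is taken to be the motif AGGAGG, the initiation codon is taken to be ATG, the sequence is written in the DNA alphabet, and the "distance" is counted as the number of nucleotides lying between the end of the motif and the first base of the initiation codon. The function names sd_spacing and spacing_is_appropriate are hypothetical.

```python
def sd_spacing(region, sd_motif="AGGAGG", start_codon="ATG"):
    """Count nucleotides between the end of the SD motif and the first downstream start codon.

    Returns None if either element is missing.
    """
    seq = region.upper()
    sd_pos = seq.find(sd_motif)
    if sd_pos < 0:
        return None
    sd_end = sd_pos + len(sd_motif)
    start_pos = seq.find(start_codon, sd_end)
    if start_pos < 0:
        return None
    return start_pos - sd_end


def spacing_is_appropriate(region, low=6, high=18):
    """Check the example range of 6 to 18 nucleotides given in the text."""
    spacing = sd_spacing(region)
    return spacing is not None and low <= spacing <= high
```

A construct with an eight-nucleotide spacer passes the check, whereas one with a three-nucleotide spacer does not, which would suggest adjusting the plasmid.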
[0205] The transcription termination sequence is not always necessary for the expression of the DNA of the present
invention. However, it is preferred to arrange the transcription terminating sequence just downstream of the structural
gene.

[0206] One of ordinary skill in the art will appreciate that the codons of the above-described elements may be
optimized, in a known manner, depending on the host cells and environmental conditions utilized.

[0207] Examples of the host cell include microorganisms belonging to the genus Escherichia, the genus Serratia,
the genus Bacillus, the genus Brevibacterium, the genus Corynebacterium, the genus Microbacterium, the genus
Pseudomonas, and the like. Specific examples include Escherichia coli XL1-Blue, Escherichia coli XL2-Blue, Escherichia
coli DH1, Escherichia coli MC1000, Escherichia coli KY3276, Escherichia coli W1485, Escherichia coli JM109,
Escherichia coli HB101, Escherichia coli No. 49, Escherichia coli W3110, Escherichia coli NY49, Escherichia coli GI698,
Escherichia coli TB1, Serratia ficaria, Serratia fonticola, Serratia liquefaciens, Serratia marcescens, Bacillus subtilis,
Bacillus amyloliquefaciens, Corynebacterium ammoniagenes, Brevibacterium immariophilum ATCC 14068,
Brevibacterium saccharolyticum ATCC 14066, Corynebacterium glutamicum ATCC 13032, Corynebacterium glutamicum ATCC
13869, Corynebacterium glutamicum ATCC 14067 (prior genus and species: Brevibacterium flavum), Corynebacterium
glutamicum ATCC 13869 (prior genus and species: Brevibacterium lactofermentum, or Corynebacterium
lactofermentum), Corynebacterium acetoacidophilum ATCC 13870, Corynebacterium thermoaminogenes FERM 9244,
Microbacterium ammoniaphilum ATCC 15354, Pseudomonas putida, Pseudomonas sp. D-0110, and the like.
[0208] When Corynebacterium glutamicum or an analogous microorganism is used as a host, an EMF necessary
for expressing the polypeptide is not always contained in the vector so long as the polynucleotide of the present
invention contains an EMF. When the EMF is not contained in the polynucleotide, it is necessary to prepare the EMF
separately and ligate it so as to be in operable combination. Also, when a higher expression amount or specific
expression regulation is necessary, it is necessary to ligate the EMF corresponding thereto so as to put the EMF in
operable combination with the polynucleotide. Examples of using an externally ligated EMF are disclosed in
Microbiology, 142: 1297-1309 (1996).

[0209] With regard to the method for the introduction of the recombinant vector, any method for introducing DNA into
the above-described host cells, such as a method in which a calcium ion is used (Proc. Natl. Acad. Sci. USA, 69: 2110
(1972)), a protoplast method (Japanese Published Unexamined Patent Application No. 2483942/88), the methods
described in Gene, 17: 107 (1982) and Molecular & General Genetics, 168: 111 (1979) and the like, can be used.

[0210] When yeast is used as the host cell, examples of the expression vector include pYES2 (manufactured by
Invitrogen), YEp13 (ATCC 37115), YEp24 (ATCC 37051), YCp50 (ATCC 37419), pHS19, pHS15, and the like.

[0211] Any promoter can be used so long as it can be expressed in yeast. Examples include a promoter of a gene
in the glycolytic pathway, such as hexose kinase and the like, PHO5 promoter, PGK promoter, GAP promoter, ADH
promoter, gal 1 promoter, gal 10 promoter, a heat shock protein promoter, MF α1 promoter, CUP 1 promoter, and the like.

[0212] Examples of the host cell include microorganisms belonging to the genus Saccharomyces, the genus
Schizosaccharomyces, the genus Kluyveromyces, the genus Trichosporon, the genus Schwanniomyces, the genus
Pichia, the genus Candida and the like. Specific examples include Saccharomyces cerevisiae, Schizosaccharomyces
pombe, Kluyveromyces lactis, Trichosporon pullulans, Schwanniomyces alluvius, Candida utilis and the like.

[0213] With regard to the method for the introduction of the recombinant vector, any method for introducing DNA into
yeast, such as an electroporation method (Methods. Enzymol., 194: 182 (1990)), a spheroplast method (Proc. Natl.
Acad. Sci. USA, 75: 1929 (1978)), a lithium acetate method (J. Bacteriol., 153: 163 (1983)), a method described in
Proc. Natl. Acad. Sci. USA, 75: 1929 (1978) and the like, can be used.

[0214] When animal cells are used as the host cells, examples of the expression vector include pcDNA3.1, pSinRep5
and pCEP4 (manufactured by Invitrogen), pRev-Tre (manufactured by Clontech), pAxCAwt (manufactured by Takara
Shuzo), pcDNAI and pcDM8 (manufactured by Funakoshi), pAGE107 (Japanese Published Unexamined Patent
Application No. 22979/91; Cytotechnology, 3: 133 (1990)), pAS3-3 (Japanese Published Unexamined Patent Application
No. 227075/90), pcDM8 (Nature, 329: 840 (1987)), pcDNAI/Amp (manufactured by Invitrogen), pREP4 (manufactured
by Invitrogen), pAGE103 (J. Biochem., 101: 1307 (1987)), pAGE210, and the like.

[0215] Any promoter can be used so long as it can function in animal cells. Examples include a promoter of IE
(immediate early) gene of cytomegalovirus (CMV), an early promoter of SV40, a promoter of retrovirus, a
metallothionein promoter, a heat shock promoter, SRα promoter, and the like. Also, the enhancer of the IE gene of human
CMV can be used together with the promoter.

[0216] Examples of the host cell include human Namalwa cell, monkey COS cell, Chinese hamster CHO cell,
HST5637 (Japanese Published Unexamined Patent Application No. 299/88), and the like.

[0217] The method for introduction of the recombinant vector into animal cells is not particularly limited, so long as
it is the general method for introducing DNA into animal cells, such as an electroporation method (Cytotechnology, 3:
133 (1990)), a calcium phosphate method (Japanese Published Unexamined Patent Application No. 227075/90), a
lipofection method (Proc. Natl. Acad. Sci. USA, 84: 7413 (1987)), the method described in Virology, 52: 456 (1973),
and the like.

[0218] When insect cells are used as the host cells, the polypeptide can be expressed, for example, by the method
described in Baculovirus Expression Vectors, A Laboratory Manual, W.H. Freeman and Company, New York (1992),
Biotechnology, 6: 47 (1988), or the like.

[0219] Specifically, a recombinant gene transfer vector and baculovirus are simultaneously introduced into insect cells
to obtain a recombinant virus in an insect cell culture supernatant, and then the insect cells are infected with the resulting
recombinant virus to express the polypeptide.

[0220] Examples of the gene introducing vector used in the method include pBlueBac4.5, pVL1392, pVL1393 and
pBlueBacIII (manufactured by Invitrogen), and the like.

[0221] Examples of the baculovirus include Autographa californica nuclear polyhedrosis virus with which insects of
the family Barathra are infected, and the like.

[0222] Examples of the insect cells include Spodoptera frugiperda oocytes Sf9 and Sf21 (Baculovirus Expression
Vectors, A Laboratory Manual, W.H. Freeman and Company, New York (1992)), Trichoplusia ni oocyte High 5
(manufactured by Invitrogen) and the like.

[0223] The method for simultaneously incorporating the above-described recombinant gene transfer vector and the
above-described baculovirus for the preparation of the recombinant virus includes the calcium phosphate method (Japanese
Published Unexamined Patent Application No. 227075/90), the lipofection method (Proc. Natl. Acad. Sci. USA, 84: 7413
(1987)) and the like.

[0224] When plant cells are used as the host cells, examples of the expression vector include a Ti plasmid, a tobacco
mosaic virus vector, and the like.

[0225] Any promoter can be used so long as it can be expressed in plant cells. Examples include 35S promoter of 
cauliflower mosaic virus (CaMV), rice actin 1 promoter, and the like. 

[0226] Examples of the host cells include plant cells and the like, such as tobacco, potato, tomato, carrot, soybean,
rape, alfalfa, rice, wheat, barley, and the like.

[0227] The method for introducing the recombinant vector is not particularly limited, so long as it is the general method
for introducing DNA into plant cells, such as the Agrobacterium method (Japanese Published Unexamined Patent
Application No. 140885/84, Japanese Published Unexamined Patent Application No. 70080/85, WO 94/00977), the
electroporation method (Japanese Published Unexamined Patent Application No. 251887/85), the particle gun method
(Japanese Patents 2606856 and 2517813), and the like.

[0228] The transformant of the present invention includes a transformant containing the polypeptide of the present
invention per se rather than as a recombinant vector, that is, a transformant containing the polypeptide of the present
invention which is integrated into a chromosome of the host, in addition to the transformant containing the above
recombinant vector.

[0229] When expressed in yeasts, animal cells, insect cells or plant cells, a glycopolypeptide or glycosylated
polypeptide can be obtained.

[0230] The polypeptide can be produced by culturing the thus obtained transformant of the present invention in a
culture medium to produce and accumulate the polypeptide of the present invention or any polypeptide expressed
under the control of an EMF of the present invention, and recovering the polypeptide from the culture.

[0231] Culturing of the transformant of the present invention in a culture medium is carried out according to the
conventional method as used in culturing of the host.

[0232] When the transformant of the present invention is obtained using a prokaryote, such as Escherichia coli or
the like, or a eukaryote, such as yeast or the like, as the host, the transformant is cultured as follows.

[0233] Any of a natural medium and a synthetic medium can be used, so long as it contains a carbon source, a
nitrogen source, an inorganic salt and the like which can be assimilated by the transformant, and so long as the
transformant can be cultured efficiently in it.

[0234] Examples of the carbon source include those which can be assimilated by the transformant, such as carbo- 
hydrates (for example, glucose, fructose, sucrose, molasses containing them, starch, starch hydrolysate, and the like), 
organic acids (for example, acetic acid, propionic acid, and the like), and alcohols (for example, ethanol, propanol, and 
the like). 

[0235] Examples of the nitrogen source include ammonia, various ammonium salts of inorganic acids or organic
acids (for example, ammonium chloride, ammonium sulfate, ammonium acetate, ammonium phosphate, and the like),
other nitrogen-containing compounds, peptone, meat extract, yeast extract, corn steep liquor, casein hydrolysate,
soybean meal and soybean meal hydrolysate, various fermented cells and hydrolysates thereof, and the like.

[0236] Examples of inorganic salt include potassium dihydrogen phosphate, dipotassium hydrogen phosphate,
magnesium phosphate, magnesium sulfate, sodium chloride, ferrous sulfate, manganese sulfate, copper sulfate, calcium
carbonate, and the like.

[0237] The culturing is carried out under aerobic conditions by shaking culture, submerged-aeration stirring culture
or the like. The culturing temperature is preferably from 15 to 40°C, and the culturing time is generally from 16 hours
to 7 days. The pH of the medium is preferably maintained at 3.0 to 9.0 during the culturing. The pH can be adjusted
using an inorganic or organic acid, an alkali solution, urea, calcium carbonate, ammonia, or the like.

[0238] Also, antibiotics, such as ampicillin, tetracycline, and the like, can be added to the medium during the culturing,
if necessary.

[0239] When a microorganism transformed with a recombinant vector containing an inducible promoter is cultured,
an inducer can be added to the medium, if necessary.

[0240] For example, isopropyl-β-D-thiogalactopyranoside (IPTG) or the like can be added to the medium when a
microorganism transformed with a recombinant vector containing lac promoter is cultured, or indoleacrylic acid (IAA)
or the like can be added thereto when a microorganism transformed with an expression vector containing trp promoter
is cultured.

[0241] Examples of the medium used in culturing a transformant obtained using animal cells as the host cells include
RPMI 1640 medium (The Journal of the American Medical Association, 199: 519 (1967)), Eagle's MEM medium
(Science, 122: 501 (1952)), Dulbecco's modified MEM medium (Virology, 8: 396 (1959)), 199 Medium (Proceeding of the
Society for the Biological Medicine, 73: 1 (1950)), the above-described media to which fetal calf serum has been added,
and the like.

[0242] The culturing is carried out generally at a pH of 6 to 8 and a temperature of 30 to 40°C in the presence of 5%
CO2 for 1 to 7 days.

[0243] Also, if necessary, antibiotics, such as kanamycin, penicillin, and the like, can be added to the medium during 
the culturing. 

[0244] Examples of the medium used in culturing a transformant obtained using insect cells as the host cells include
TNM-FH medium (manufactured by Pharmingen), Sf-900 II SFM (manufactured by Life Technologies), ExCell 400 and
ExCell 405 (manufactured by JRH Biosciences), Grace's Insect Medium (Nature, 195: 788 (1962)), and the like.

[0245] The culturing is carried out generally at a pH of 6 to 7 and a temperature of 25 to 30°C for 1 to 5 days.

[0246] Additionally, antibiotics, such as gentamicin and the like, can be added to the medium during the culturing, if
necessary.

[0247] A transformant obtained by using a plant cell as the host cell can be used as the cell or after differentiating
to a plant cell or organ. Examples of the medium used in the culturing of the transformant include Murashige and Skoog
(MS) medium, White medium, media to which a plant hormone, such as auxin, cytokinin, or the like has been added,
and the like.

[0248] The culturing is carried out generally at a pH of 5 to 9 and a temperature of 20 to 40°C for 3 to 60 days.

[0249] Also, antibiotics, such as kanamycin, hygromycin and the like, can be added to the medium during the cul- 
turing, if necessary. 

[0250] As described above, the polypeptide can be produced by culturing a transformant derived from a
microorganism, animal cell or plant cell containing a recombinant vector into which a DNA encoding the polypeptide of the
present invention has been inserted according to the general culturing method to produce and accumulate the
polypeptide, and recovering the polypeptide from the culture.

[0251] The process of gene expression may include, in addition to direct expression, secretory production of the
encoded protein, fusion protein expression and the like in accordance with the methods described in Molecular
Cloning, 2nd ed.

[0252] The method for producing the polypeptide of the present invention includes a method of intracellular
expression in a host cell, a method of extracellular secretion from a host cell, or a method of production on a host cell membrane
outer envelope. The method can be selected by changing the host cell employed or the structure of the polypeptide
produced.

[0253] When the polypeptide of the present invention is produced in a host cell or on a host cell membrane outer
envelope, the polypeptide can be positively secreted extracellularly according to, for example, the method of Paulson
et al. (J. Biol. Chem., 264: 17619 (1989)), the method of Lowe et al. (Proc. Natl. Acad. Sci. USA, 86: 8227 (1989);
Genes Develop., 4: 1288 (1990)), and/or the methods described in Japanese Published Unexamined Patent Application
No. 336963/93, WO 94/23021, and the like.

[0254] Specifically, the polypeptide of the present invention can be positively secreted extracellularly by expressing
it, according to the recombinant DNA technique, in a form in which a signal peptide has been added in front of a
polypeptide containing an active site of the polypeptide of the present invention.

[0255] Furthermore, the amount produced can be increased using a gene amplification system, such as by use of
a dihydrofolate reductase gene or the like according to the method described in Japanese Published Unexamined
Patent Application No. 227075/90.

[0256] Moreover, the polypeptide of the present invention can be produced by a transgenic animal individual
(transgenic nonhuman animal) or plant individual (transgenic plant).

[0257] When the transformant is the animal individual or plant individual, the polypeptide of the present invention 
can be produced by breeding or cultivating it so as to produce and accumulate the polypeptide, and recovering the 
polypeptide from the animal individual or plant individual. 
[0258] Examples of the method for producing the polypeptide of the present invention using the animal individual
include a method for producing the polypeptide of the present invention in an animal developed by inserting a gene
according to methods known to those of ordinary skill in the art (American Journal of Clinical Nutrition, 63: 639S (1996),
American Journal of Clinical Nutrition, 63: 627S (1996), Bio/Technology, 9: 830 (1991)).

[0259] In the animal individual, the polypeptide can be produced by breeding a transgenic nonhuman animal into which
the DNA encoding the polypeptide of the present invention has been inserted to produce and accumulate the
polypeptide in the animal, and recovering the polypeptide from the animal. Examples of the production and accumulation place
in the animal include milk (Japanese Published Unexamined Patent Application No. 309192/88), egg and the like of
the animal. Any promoter can be used, so long as it can be expressed in the animal. Suitable examples include an
α-casein promoter, a β-casein promoter, a β-lactoglobulin promoter, a whey acidic protein promoter, and the like, which
are specific for mammary glandular cells.

[0260] Examples of the method for producing the polypeptide of the present invention using the plant individual
include a method for producing the polypeptide of the present invention by cultivating a transgenic plant into which the
DNA encoding the protein of the present invention has been inserted by a known method (Tissue Culture, 20 (1994),
Tissue Culture, 21 (1994), Trends in Biotechnology, 15: 45 (1997)) to produce and accumulate the polypeptide in the
plant, and recovering the polypeptide from the plant.

[0261] The polypeptide according to the present invention can also be obtained by translation in vitro. 

[0262] The polypeptide of the present invention can be produced by a translation system in vitro. There are, for
example, two in vitro translation methods which may be used, namely, a method using RNA as a template and another
method using DNA as a template. The template RNA includes the whole RNA, mRNA, an in vitro transcription product,
and the like. The template DNA includes a plasmid containing a transcriptional promoter and a target gene integrated
therein and downstream of the initiation site, a PCR/RT-PCR product and the like. To select the most suitable system
for the in vitro translation, the origin of the gene encoding the protein to be synthesized (prokaryotic cell/eukaryotic
cell), the type of the template (DNA/RNA), the purpose of using the synthesized protein and the like should be
considered. In vitro translation kits having various characteristics are commercially available from many companies
(Boehringer Mannheim, Promega, Stratagene, or the like), and every kit can be used in producing the polypeptide according
to the present invention.

[0263] Transcription/translation of a DNA nucleotide sequence cloned into a plasmid containing a T7 promoter can
be carried out using an in vitro transcription/translation system E. coli T7 S30 Extract System for Circular DNA
(manufactured by Promega, catalogue No. L1130). Also, transcription/translation using, as a template, a linear prokaryotic
DNA of a supercoil non-sensitive promoter, such as lacUV5, tac, λPL(con), λPL, or the like, can be carried out using
an in vitro transcription/translation system E. coli S30 Extract System for Linear Templates (manufactured by Promega,
catalogue No. L1030). Examples of the linear prokaryotic DNA used as a template include a DNA fragment, a
PCR-amplified DNA product, a duplicated oligonucleotide ligation, an in vitro transcriptional RNA, a prokaryotic RNA, and
the like.

[0264] In addition to the production of the polypeptide according to the present invention, synthesis of a radioactively
labeled protein, confirmation of the expression capability of a cloned gene, analysis of the function of transcriptional
reaction or translation reaction, and the like can be carried out using this system.

[0265] The polypeptide produced by the transformant of the present invention can be isolated and purified using the
general method for isolating and purifying an enzyme. For example, when the polypeptide of the present invention is
expressed as a soluble product in the host cells, the cells are collected by centrifugation after cultivation, suspended
in an aqueous buffer, and disrupted using an ultrasonicator, a French press, a Manton Gaulin homogenizer, a Dynomill,
or the like to obtain a cell-free extract. From the supernatant obtained by centrifuging the cell-free extract, a purified
product can be obtained by the general method used for isolating and purifying an enzyme, for example, solvent
extraction, salting out using ammonium sulfate or the like, desalting, precipitation using an organic solvent, anion
exchange chromatography using a resin, such as diethylaminoethyl (DEAE)-Sepharose, DIAION HPA-75 (manufactured
by Mitsubishi Chemical) or the like, cation exchange chromatography using a resin, such as S-Sepharose FF
(manufactured by Pharmacia) or the like, hydrophobic chromatography using a resin, such as butyl sepharose, phenyl
sepharose or the like, gel filtration using a molecular sieve, affinity chromatography, chromatofocusing, or electrophoresis,
such as isoelectric focusing or the like, alone or in combination thereof.

[0266] When the polypeptide is expressed as an insoluble product in the host cells, the cells are collected in the
same manner, disrupted and centrifuged to recover the insoluble product of the polypeptide as the precipitate fraction.
Next, the insoluble product of the polypeptide is solubilized with a protein denaturing agent. The solubilized solution
is diluted or dialyzed to lower the concentration of the protein denaturing agent in the solution. Thus, the normal
configuration of the polypeptide is reconstituted. After the procedure, a purified product of the polypeptide can be obtained
by a purification/isolation method similar to the above.

[0267] When the polypeptide of the present invention or its derivative (for example, a polypeptide formed by adding
a sugar chain thereto) is secreted out of cells, the polypeptide or its derivative can be collected in the culture supernatant.
Namely, the culture supernatant is obtained by treating the culture medium in a treatment similar to the above (for
example, centrifugation). Then, a purified product can be obtained from the culture medium using a purification/isolation
method similar to the above.

[0268] The polypeptide obtained by the above method is within the scope of the polypeptide of the present invention,
and examples include a polypeptide encoded by a polynucleotide comprising the nucleotide sequence selected from
SEQ ID NOS:2 to 3431, and a polypeptide comprising an amino acid sequence represented by any one of SEQ ID
NOS:3502 to 6931.

[0269] Furthermore, a polypeptide comprising an amino acid sequence in which at least one amino acid is deleted,
replaced, inserted or added in the amino acid sequence of the polypeptide and having substantially the same activity
as that of the polypeptide is included in the scope of the present invention. The term "substantially the same activity
as that of the polypeptide" means the same activity represented by the inherent function, enzyme activity or the like
possessed by the polypeptide which has not been deleted, replaced, inserted or added. The polypeptide can be
obtained using a method for introducing site-directed mutation(s) described in, for example, Molecular Cloning, 2nd ed.,
Current Protocols in Molecular Biology, Nuc. Acids. Res., 10: 6487 (1982), Proc. Natl. Acad. Sci. USA, 79: 6409 (1982),
Gene, 34: 315 (1985), Nuc. Acids. Res., 13: 4431 (1985), Proc. Natl. Acad. Sci. USA, 82: 488 (1985) and the like. For
example, the polypeptide can be obtained by introducing mutation(s) to DNA encoding a polypeptide having the amino
acid sequence represented by any one of SEQ ID NOS:3502 to 6931. The number of the amino acids which are deleted,
replaced, inserted or added is not particularly limited; however, it is usually 1 to the order of tens, preferably 1 to 20,
more preferably 1 to 10, and most preferably 1 to 5, amino acids.

[0270] The at least one amino acid deletion, replacement, insertion or addition in the amino acid sequence of the
polypeptide of the present invention is used herein to mean that at least one amino acid is deleted, replaced, inserted
or added at one or plural positions in the amino acid sequence. The deletion, replacement, insertion or addition may
be caused in the same amino acid sequence simultaneously. Also, the amino acid residue replaced, inserted or added
can be natural or non-natural. Examples of the natural amino acid residue include L-alanine, L-asparagine, L-aspartic
acid, L-glutamine, L-glutamic acid, glycine, L-histidine, L-isoleucine, L-leucine, L-lysine, L-methionine, L-phenylalanine,
L-proline, L-serine, L-threonine, L-tryptophan, L-tyrosine, L-valine, L-cysteine, and the like.

[0271] Herein, examples of amino acid residues which are replaced with each other are shown below. The amino 
acid residues in the same group can be replaced with each other.

Group A: 

[0272] leucine, isoleucine, norleucine, valine, norvaline, alanine, 2-aminobutanoic acid, methionine, O-methylserine,
t-butylglycine, t-butylalanine, cyclohexylalanine;

Group B: 

[0273] aspartic acid, glutamic acid, isoaspartic acid, isoglutamic acid, 2-aminoadipic acid, 2-aminosuberic acid;
Group C: 

[0274] asparagine, glutamine; 
Group D: 

[0275] lysine, arginine, ornithine, 2,4-diaminobutanoic acid, 2,3-diaminopropionic acid; 
Group E: 

[0276] proline, 3-hydroxyproline, 4-hydroxyproline; 
Group F: 

[0277] serine, threonine, homoserine; 
Group G: 

[0278] phenylalanine, tyrosine. 
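
The grouping above can be expressed as a simple lookup. The following is a minimal sketch, assuming residue names are compared as plain lower-case strings; the function name `shared_group` is hypothetical and is introduced only for illustration.

```python
# Groups A to G as listed above; names are stored in lower case for comparison.
GROUPS = {
    "A": {"leucine", "isoleucine", "norleucine", "valine", "norvaline", "alanine",
          "2-aminobutanoic acid", "methionine", "o-methylserine",
          "t-butylglycine", "t-butylalanine", "cyclohexylalanine"},
    "B": {"aspartic acid", "glutamic acid", "isoaspartic acid", "isoglutamic acid",
          "2-aminoadipic acid", "2-aminosuberic acid"},
    "C": {"asparagine", "glutamine"},
    "D": {"lysine", "arginine", "ornithine", "2,4-diaminobutanoic acid",
          "2,3-diaminopropionic acid"},
    "E": {"proline", "3-hydroxyproline", "4-hydroxyproline"},
    "F": {"serine", "threonine", "homoserine"},
    "G": {"phenylalanine", "tyrosine"},
}


def shared_group(residue_a, residue_b):
    """Return the letter of the group containing both residues, or None."""
    a = residue_a.strip().lower()
    b = residue_b.strip().lower()
    for letter, members in GROUPS.items():
        if a in members and b in members:
            return letter
    return None


if __name__ == "__main__":
    print(shared_group("leucine", "valine"))   # same group
    print(shared_group("serine", "lysine"))    # different groups
```

A return value of None only means that the pair is not listed in the same group here; it does not by itself indicate that the replacement alters the activity.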

[0279] Also, in order that the resulting mutant polypeptide has substantially the same activity as that of the polypeptide
which has not been mutated, it is preferred that the mutant polypeptide has a homology of 60% or more, preferably
80% or more, and particularly preferably 95% or more, with the polypeptide which has not been mutated, when calculated,
for example, using default (initial setting) parameters by a homology searching software, such as BLAST, FASTA,
or the like.
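
As an illustration of the thresholds above, the following minimal sketch computes a percent identity over two sequences that are assumed to be already aligned (equal length, gaps written as "-") and reports which threshold is met. It is not the calculation performed by BLAST or FASTA, whose scores depend on their own alignment and parameter settings; the function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns with identical residues (gap-only columns ignored)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == gap and y == gap)]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)


def homology_tier(identity):
    """Map a percent value to the thresholds named in the text."""
    if identity >= 95:
        return "95% or more"
    if identity >= 80:
        return "80% or more"
    if identity >= 60:
        return "60% or more"
    return "below 60%"


if __name__ == "__main__":
    value = percent_identity("MKTAYIAK", "MKTAHIAK")
    print(value, homology_tier(value))
```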

[0280] Also, the polypeptide of the present invention can be produced by a chemical synthesis method such as
Fmoc (fluorenylmethyloxycarbonyl) method, tBoc (t-butyloxycarbonyl) method, or the like. It can also be synthesized
using a peptide synthesizer manufactured by Advanced ChemTech, Perkin-Elmer, Pharmacia, Protein Technology
Instrument, Synthecell-Vega, PerSeptive, Shimadzu Corporation, or the like.

[0281] The transformant of the present invention can be used for purposes other than the production of the polypeptide
of the present invention. 

[0282] Specifically, at least one component selected from an amino acid, a nucleic acid, a vitamin, a saccharide, an 
organic acid, and analogues thereof can be produced by culturing the transformant containing the polynucleotide or
recombinant vector of the present invention in a medium to produce and accumulate at least one component selected 
from amino acids, nucleic acids, vitamins, saccharides, organic acids, and analogues thereof, and recovering the same 
from the medium. 

[0283] The biosynthesis pathways, decomposition pathways and regulatory mechanisms of physiologically active 
substances such as amino acids, nucleic acids, vitamins, saccharides, organic acids and analogues thereof differ from 
organism to organism. The productivity of such a physiologically active substance can be improved using these differences,
specifically by introducing a heterogeneous gene relating to the biosynthesis thereof. For example, the content
of lysine, which is one of the essential amino acids, in a plant seed was improved by introducing a synthase gene
derived from a bacterium (WO 93/19190). Also, arginine is excessively produced in a culture by introducing an arginine
synthase gene derived from Escherichia coli (Japanese Examined Patent Publication 23750/93). 
[0284] To produce such a physiologically active substance, the transformant according to the present invention can
be cultured by the same method as employed in culturing the transformant for producing the polypeptide of the present
invention as described above. Also, the physiologically active substance can be recovered from the culture medium
in combination with, for example, the ion exchange resin method, the precipitation method and other known methods.
[0285] Examples of methods known to one of ordinary skill in the art include electroporation, calcium transfection, 
the protoplast method, the method using a phage, and the like, when the host is a bacterium; and microinjection, 
calcium phosphate transfection, the positively charged lipid-mediated method and the method using a virus, and the 
like, when the host is a eukaryote (Molecular Cloning, 2nd ed.; Spector et al., Cetls/a laboratory manual, Cold Spring 
Harbour Laboratory Press, 1998)). Examples of the host include prokaryotes, lower eukaryotes (for example, y asts), 
higher eukaryotes (for example, mammals), and cells isolated therefrom. As the state of a recombinant polynucleotide 
fragment present in the host cells : it can be integrated into the chromosome of the host. Alternatively, it can be integrated 
into a factor (for example, a plasmid) having an independent replication unit outside the chromosome. These trans- 
formants are usable in producing the polypeptides of the present invention encoded by the ORF of the gen m of 
Corynebactehum glutamicum, the polynucleotides of the present invention and fragments thereof. Alternativ ly, they 
can be used in producing arbitrary polypeptides under the regulation by an EMF of the present invention. 

11 . Preparation of antibody recognizing the polypeptide of the present invention 

[0286] An antibody which recognizes the polypeptide of the present invention, such as a polyclonal antibody, a mon- 
oclonal antibody, or the like, can be produced using, as an antigen, a purified product of the polypeptide of the present 
invention or a partial fragment polypeptide of the polypeptide or a peptide having a partial amino acid sequence of the 
polypeptide of the present invention. 

(1) Production of polyclonal antibody 

[0287] A polyclonal antibody can be produced using, as an antigen, a purified product of the polypeptide of the
present invention, a partial fragment polypeptide of the polypeptide, or a peptide having a partial amino acid sequence
of the polypeptide of the present invention, and immunizing an animal with the same. 

[0288] Examples of the animal to be immunized include rabbits, goats, rats, mice, hamsters, chickens and the like.
[0289] A dosage of the antigen is preferably 50 to 100 µg per animal.

[0290] When the peptide is used as the antigen, it is preferably a peptide covalently bonded to a carrier protein, such 
as keyhole limpet haemocyanin, bovine thyroglobulin, or the like. The peptide used as the antigen can be synthesized
by a peptide synthesizer. 

[0291] The administration of the antigen is, for example, carried out 3 to 10 times at the intervals of 1 or 2 weeks 
after the first administration. On the 3rd to 7th day after each administration, a blood sample is collected from the
venous plexus of the eyeground, and it is confirmed that the serum reacts with the antigen by the enzyme immunoassay
(Enzyme-linked Immunosorbent Assay (ELISA), Igaku Shoin (1976) ; Antibodies - A Laboratory Manual, Cold Spring 
Harbor Laboratory (1988)) or the like. 

[0292] Serum is obtained from the immunized non-human mammal with a sufficient antibody titer against the antigen
used for the immunization, and the serum is isolated and purified to obtain a polyclonal antibody.

[0293] Examples of the method for the isolation and purification include centrifugation, salting out by 40-50% saturated
ammonium sulfate, caprylic acid precipitation (Antibodies, A Laboratory manual, Cold Spring Harbor Laboratory
(1988)), or chromatography using a DEAE-Sepharose column, an anion exchange column, a protein A- or G-column,
a gel filtration column, and the like, alone or in combination thereof, by methods known to those of ordinary skill in the art.

(2) Production of monoclonal antibody 

(a) Preparation of antibody-producing cell 

[0294] A rat whose serum shows a sufficient antibody titer against a partial fragment polypeptide of the polypeptide
of the present invention used for immunization is used as a supply source of an antibody-producing cell.
[0295] On the 3rd to 7th day after the final administration of the antigen substance to the rat showing the antibody titer, the
spleen is excised.

[0296] The spleen is cut to pieces in MEM medium (manufactured by Nissui Pharmaceutical), loosened using a pair
of forceps, followed by centrifugation at 1,200 rpm for 5 minutes, and the resulting supernatant is discarded.

[0297] The spleen in the precipitated fraction is treated with a Tris-ammonium chloride buffer (pH 7.65) for 1 to 2 
minutes to eliminate erythrocytes and washed three times with MEM medium, and the resulting spleen cells are used 
as antibody-producing cells. 

(b) Preparation of myeloma cells

[0298] As myeloma cells, an established cell line obtained from mouse or rat is used. Examples of useful cell lines 
include those derived from a mouse, such as P3-X63Ag8-U1 (hereinafter referred to as "P3-U1") (Curr. Topics in Microbiol.
Immunol., 81: 1 (1978); Europ. J. Immunol., 6: 511 (1976)); SP2/0-Ag14 (SP-2) (Nature, 276: 269 (1978));
P3-X63-Ag8653 (653) (J. Immunol., 123: 1548 (1979)); P3-X63-Ag8 (X63) cell line (Nature, 256: 495 (1975)), and the
like, which are 8-azaguanine-resistant mouse (BALB/c) myeloma cell lines. These cell lines are subcultured in 8-azaguanine
medium (medium in which, to a medium obtained by adding 1.5 mmol/l glutamine, 5×10⁻⁵ mol/l 2-mercaptoethanol,
10 µg/ml gentamicin and 10% fetal calf serum (FCS) (manufactured by CSL) to RPMI-1640 medium (hereinafter
referred to as the "normal medium"), 8-azaguanine is further added at 15 µg/ml) and cultured in the normal
medium 3 or 4 days before cell fusion, and 2×10⁷ or more of the cells are used for the fusion.

(c) Production of hybridoma 

[0299] The antibody-producing cells obtained in (a) and the myeloma cells obtained in (b) are washed with MEM
medium or PBS (disodium hydrogen phosphate: 1.83 g, sodium dihydrogen phosphate: 0.21 g, sodium chloride: 7.65
g, distilled water: 1 liter, pH: 7.2) and mixed to give a ratio of antibody-producing cells : myeloma cells = 5 : 1 to 10 : 1,
followed by centrifugation at 1,200 rpm for 5 minutes, and the supernatant is discarded.
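
The mixing ratio of 5 : 1 to 10 : 1 can be checked with simple arithmetic. The sketch below is illustrative only; the function names are hypothetical, and the example count of 2×10⁷ myeloma cells is taken from the minimum stated in (b).

```python
def antibody_cell_range(myeloma_cells, low_ratio=5, high_ratio=10):
    """Range of antibody-producing cell counts giving a ratio of 5:1 to 10:1."""
    return myeloma_cells * low_ratio, myeloma_cells * high_ratio


def ratio_in_range(antibody_cells, myeloma_cells, low_ratio=5, high_ratio=10):
    """True if antibody-producing cells : myeloma cells lies within the stated range."""
    return low_ratio * myeloma_cells <= antibody_cells <= high_ratio * myeloma_cells


if __name__ == "__main__":
    low, high = antibody_cell_range(20_000_000)
    print(f"{low:.1e} to {high:.1e} antibody-producing cells")
```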

[0300] The cells in the resulting precipitated fraction are thoroughly loosened, 0.2 to 1 ml of a mixed solution of 2
g of polyethylene glycol-1000 (PEG-1000), 2 ml of MEM medium and 0.7 ml of dimethylsulfoxide (DMSO) per 10⁸
antibody-producing cells is added to the cells under stirring at 37°C, and then 1 to 2 ml of MEM medium is further
added thereto several times at 1 to 2 minute intervals.

[0301] After the addition, MEM medium is added to give a total amount of 50 ml. The resulting prepared solution is 
centrifuged at 900 rpm for 5 minutes, and then the supernatant is discarded. The cells in the resulting precipitated 
fraction are gently loosened and then gently suspended in 100 ml of HAT medium (the normal medium to which 10⁻⁴
mol/l hypoxanthine, 1.5×10⁻⁵ mol/l thymidine and 4×10⁻⁷ mol/l aminopterin have been added) by repeated drawing
up into and discharging from a measuring pipette. 

[0302] The suspension is poured into a 96 well culture plate at 100 µl/well and cultured at 37°C for 7 to 14 days in
a 5% CO2 incubator.

[0303] After culturing, a part of the culture supernatant is recovered, and a hybridoma which specifically reacts with 
a partial fragment polypeptide of the polypeptide of the present invention is selected according to the enzyme immunoassay
described in Antibodies, A Laboratory manual, Cold Spring Harbor Laboratory, Chapter 14 (1998) and the like.
[0304] A specific example of the enzyme immunoassay is described below. 

[0305] The partial fragment polypeptide of the polypeptide of the present invention used as the antigen in the immunization
is spread on a suitable plate, is allowed to react with a hybridoma culturing supernatant or a purified antibody
obtained in (d) described below as a first antibody, and is further allowed to react with an anti-rat or anti-mouse
immunoglobulin antibody labeled with an enzyme, a chemical luminous substance, a radioactive substance or the like as a
second antibody, followed by a reaction suitable for the labeled substance. A hybridoma which specifically reacts with the
polypeptide of the present invention is selected as a hybridoma capable of producing a monoclonal antibody of the present
invention.

[0306] Cloning is repeated twice using the hybridoma by limiting dilution analysis (HT medium (a medium in which
aminopterin has been removed from HAT medium) is used first, and the normal medium is used second), and a
hybridoma which is stable and shows a sufficient antibody titer is selected as a hybridoma capable of
producing a monoclonal antibody of the present invention.

(d) Preparation of monoclonal antibody 

[0307] The monoclonal antibody-producing hybridoma cells obtained in (c) are injected intraperitoneally into 8- to
10-week-old mice or nude mice treated with pristane (intraperitoneal administration of 0.5 ml of 2,6,10,14-tetramethylpentadecane
(pristane), followed by 2 weeks of feeding) at 5×10⁶ to 20×10⁶ cells/animal. The hybridoma causes
ascites tumor in 10 to 21 days.

[0308] The ascitic fluid is collected from the mice or nude mice, and centrifuged at 3,000 rpm for 5 minutes to
remove solid contents.

[0309] A monoclonal antibody can be purified and isolated from the resulting supernatant according to a method
similar to that used for the polyclonal antibody.

[0310] The subclass of the antibody can be determined using a mouse monoclonal antibody typing kit or a rat mon- 
oclonal antibody typing kit. The polypeptide amount can be determined by the Lowry method or by calculation based 
on the absorbance at 280 nm. 

[0311] The antibody obtained as described above is within the scope of the antibody of the present invention.

[0312] The antibody can be used for the general assay using an antibody, such as a radioactive material labeled
immunoassay (RIA), competitive binding assay, an immunotissue chemical staining method (ABC method, CSA method,
etc.), immunoprecipitation, Western blotting, ELISA assay, and the like (An Introduction to Radioimmunoassay and
Related Techniques, Elsevier Science (1986); Techniques in Immunocytochemistry, Academic Press, Vol. 1 (1982),
Vol. 2 (1983) & Vol. 3 (1986); Practice and Theory of Enzyme Immunoassays, Elsevier Science (1985); Enzyme-linked
Immunosorbent Assay (ELISA), Igaku Shoin (1976); Antibodies - A Laboratory Manual, Cold Spring Harbor Laboratory
(1988); Monoclonal Antibody Experiment Manual, Kodansha Scientific (1987); Second Series Biochemical Experiment
Course, Vol. 5, Immunobiochemistry Research Method, Tokyo Kagaku Dojin (1986)).
[0313] The antibody of the present invention can be used as it is or after being labeled with a label. 

[0314] Examples of the label include radioisotope, an affinity label (e.g., biotin, avidin, or the like), an enzyme label
(e.g., horseradish peroxidase, alkaline phosphatase, or the like), a fluorescence label (e.g., FITC, rhodamine, or the
like), a label using a rhodamine atom, (J. Histochem. Cytochem., 18: 315 (1970); Meth. Enzym., 62: 308 (1979);
Immunol., 109: 129 (1972); J. Immunol. Meth., 13: 215 (1979)), and the like.

[0315] Expression of the polypeptide of the present invention, fluctuation of the expression, the presence or absence
of structural change of the polypeptide, and the presence or absence in an organism other than coryneform bacteria
of a polypeptide corresponding to the polypeptide can be analyzed using the antibody or the labeled antibody by the 
above assay, or a polypeptide array or proteome analysis described below. 

[0316] Furthermore, the polypeptide recognized by the antibody can be purified by immunoaffinity chromatography 
using the antibody of the present invention. 

12. Production and use of polypeptide array 

(1) Production of polypeptide array

[0317] A polypeptide array can be produced using the polypeptide of the present invention obtained in the above
item 10 or the antibody of the present invention obtained in the above item 11.

[0318] The polypeptide array of the present invention includes protein chips, and comprises a solid support and the 
polypeptide or antibody of the present invention adhered to the surface of the solid support. 

[0319] Examples of the solid support include plastic such as polycarbonate or the like; an acrylic resin, such as
polyacrylamide or the like; complex carbohydrates, such as agarose, sepharose, or the like; silica; a silica-based
material, carbon, a metal, inorganic glass, latex beads, and the like.

[0320] The polypeptides or antibodies according to the present invention can be adhered to the surface of the solid 
support according to the method described in Biotechniques, 27: 1258-61 (1999); Molecular Medicine Today, 5: 326-7
(1999); Handbook of Experimental Immunology, 4th edition, Blackwell Scientific Publications, Chapter 10 (1986); Meth.
Enzym., 34 (1974); Advances in Experimental Medicine and Biology, 42 (1974); U.S. Patent 4,681,870; U.S. Patent
4,282,287; U.S. Patent 4,762,881, or the like.

[0321] The analysis described herein can be efficiently performed by adhering the polypeptide or antibody of the
present invention to the solid support at a high density, though a high fixation density is not always necessary.

(2) Use of polypeptide array

[0322] A polypeptide or a compound capable of binding to and interacting with the polypeptides of the present invention
adhered to the array can be identified using the polypeptide array to which the polypeptides of the present
invention have been adhered as described in the above (1).

[0323] Specifically, a polypeptide or a compound capable of binding to and interacting with the polypeptides of the 
present invention can be identified by subjecting the polypeptides of the present invention to the following steps (i) to (iv):

(i) preparing a polypeptide array having the polypeptide of the present invention adhered thereto by the method
of the above (1);

(ii) incubating the polypeptide immobilized on the polypeptide array together with at least one of a second polypeptide
or compound;

(iii) detecting any complex formed between the at least one of a second polypeptide or compound and the polypeptide
immobilized on the array using, for example, a label bound to the at least one of a second polypeptide or
compound, or a secondary label which specifically binds to the complex or to a component of the complex after
unbound material has been removed; and

(iv) analyzing the detection data.



[0324] Specific examples of the polypeptide array to which the polypeptide of the present invention has been adhered
include a polypeptide array containing a solid support to which at least one of a polypeptide containing an amino acid
sequence selected from SEQ ID NOS:3502 to 7001, a polypeptide containing an amino acid sequence in which at
least one amino acid is deleted, replaced, inserted or added in the amino acid sequence of the polypeptide and having
substantially the same activity as that of the polypeptide, a polypeptide containing an amino acid sequence having a
homology of 60% or more with the amino acid sequences of the polypeptide and having substantially the same activity
as that of the polypeptides, a partial fragment polypeptide, and a peptide comprising an amino acid sequence of a part
of a polypeptide is adhered.

[0325] The amount of production of a polypeptide derived from coryneform bacteria can be analyzed using a polypeptide
array to which the antibody of the present invention has been adhered in the above (1).

[0326] Specifically, the expression amount of a gene derived from a mutant of coryneform bacteria can be analyzed
by subjecting the gene to the following steps (i) to (iv):

(i) preparing a polypeptide array by the method of the above (1);

(ii) incubating the polypeptide array (the first antibody) together with a polypeptide derived from a mutant of
coryneform bacteria;

(iii) detecting the polypeptide bound to the polypeptide immobilized on the array using a labeled second antibody
of the present invention; and

(iv) analyzing the detection data.

[0327] Specific examples of the polypeptide array to which the antibody of the present invention is adhered include
a polypeptide array comprising a solid support to which at least one of an antibody which recognizes a polypeptide
comprising an amino acid sequence selected from SEQ ID NOS:3502 to 7001, a polypeptide comprising an amino
acid sequence in which at least one amino acid is deleted, replaced, inserted or added in the amino acid sequence
of the polypeptide and having substantially the same activity as that of the polypeptide, a polypeptide comprising an
amino acid sequence having a homology of 60% or more with the amino acid sequences of the polypeptide and having
substantially the same activity as that of the polypeptides, a partial fragment polypeptide, or a peptide comprising an
amino acid sequence of a part of a polypeptide is adhered.

[0328] A fluctuation in an expression amount of a specific polypeptide can be monitored using a polypeptide obtained 
in the time course of culture as the polypeptide derived from coryneform bacteria. The culturing conditions can be 
optimized by analyzing the fluctuation. 
[0329] When a polypeptide derived from a mutant of coryneform bacteria is used, a mutated polypeptide can be
detected.



13. Identification of useful mutation in mutant by proteome analysis

[0330] Usually, the proteome is used herein to refer to a method wherein a polypeptide is separated by two-dimensional
electrophoresis and the separated polypeptide is digested with an enzyme, followed by identification of the
polypeptide using a mass spectrometer (MS) and searching a data base.

[0331] The two dimensional electrophoresis means an electrophoretic method which is performed by combining two
electrophoretic procedures having different principles. For example, polypeptides are separated depending on molecular
weight in the primary electrophoresis. Next, the gel is rotated by 90° or 180° and the secondary electrophoresis
is carried out depending on isoelectric point. Thus, various separation patterns can be achieved (JIS K 3600 2474).
[0332] In searching the data base, the amino acid sequence information of the polypeptides of the present invention
and the recording medium of the present invention provided for in the above items 2 and 8 can be used.

[0333] The proteome analysis of a coryneform bacterium and its mutant makes it possible to identify a polypeptide 
showing a fluctuation therebetween. 

[0334] The proteome analysis of a wild type strain of coryneform bacteria and a production strain showing an improved
productivity of a target product makes it possible to efficiently identify a mutated protein which is useful in
breeding for improving the productivity of a target product or a protein whose expression amount fluctuates.

[0335] Specifically, a wild type strain of coryneform bacteria and a lysine-producing strain thereof are each subjected
to the proteome analysis. Then, a spot increased in the lysine-producing strain, compared with the wild type strain, is
found and a data base is searched so that a polypeptide showing an increase in yield in accordance with an increase
in the lysine productivity can be identified. For example, as a result of the proteome analysis on a wild type strain and
a lysine-producing strain, the productivity of the catalase having the amino acid sequence represented by SEQ ID NO:
3785 is increased in the lysine-producing mutant.
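
The comparison of spots between the two strains can be sketched as follows. This is a minimal illustration, assuming that spot intensities have already been quantified and matched between the two gels; the spot identifiers, the intensity values and the two-fold cut-off are hypothetical and are not taken from the present specification.

```python
def increased_spots(wild_type, producer, min_fold=2.0):
    """Return {spot_id: fold_change} for spots at least min_fold stronger in the producer."""
    result = {}
    for spot, wt_intensity in wild_type.items():
        producer_intensity = producer.get(spot)
        if producer_intensity is None or wt_intensity <= 0:
            continue  # spot not matched between gels, or no usable reference value
        fold = producer_intensity / wt_intensity
        if fold >= min_fold:
            result[spot] = fold
    return result


if __name__ == "__main__":
    wild = {"spot_01": 100.0, "spot_02": 50.0, "spot_03": 80.0}      # hypothetical values
    lysine = {"spot_01": 110.0, "spot_02": 200.0, "spot_03": 75.0}   # hypothetical values
    for spot, fold in increased_spots(wild, lysine).items():
        print(spot, f"{fold:.1f}-fold")
```

The spots returned in this way would then be identified by mass spectrometry and the data base search described above.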

[0336] When a protein having a high expression level is identified by proteome analysis using the nucleotide
sequence information and the amino acid sequence information of the genome of the coryneform bacteria of the
present invention, and a recording medium storing the sequences, the nucleotide sequence of the gene encoding this
protein and the nucleotide sequence upstream thereof can be searched at the same time, and thus, a nucleotide
sequence having a high expression promoter can be efficiently selected.
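
Retrieving the nucleotide sequence upstream of an identified gene can be sketched as below. The sketch assumes that the genome is held as a single string, that ORF coordinates are 0-based and end-exclusive on the forward strand, and that the genome is treated as linear (no wrap-around at the ends); the function names and the default length of 300 nucleotides are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a nucleotide string."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(genome, start, end, strand, length=300):
    """Sequence of up to `length` nucleotides located 5' of the ORF, read on the ORF's strand."""
    if strand == "+":
        return genome[max(0, start - length):start]
    if strand == "-":
        return reverse_complement(genome[end:end + length])
    raise ValueError("strand must be '+' or '-'")


if __name__ == "__main__":
    genome = "TTGACAATGAAATAAGGCC"  # toy sequence for illustration only
    print(upstream_region(genome, 6, 15, "+", length=6))
```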

[0337] In the proteome analysis, a spot on the two-dimensional electrophoresis gel showing a fluctuation is sometimes
derived from a modified protein. However, the modified protein can be efficiently identified using the recording medium
storing the nucleotide sequence information, the amino acid sequence information, of the genome of coryneform
bacteria, and the recording medium storing the sequences, according to the present invention.

[0338] Moreover, a useful mutation point in a useful mutant can be easily specified by searching a nucleotide sequence
(nucleotide sequence of promoters, ORF, or the like) relating to the thus identified protein using a recording
medium storing the nucleotide sequence information and the amino acid sequence information, of the genome of
coryneform bacteria of the present invention, and a recording medium storing the sequences and using a primer designed
on the basis of the detected nucleotide sequence. Once the useful mutation point is specified, an
industrially useful mutant having the useful mutation or other useful mutation derived therefrom can be easily bred.
[0339] The present invention will be explained in detail below based on Examples. However, the present invention
is not limited thereto.

Example 1

Determination of the full nucleotide sequence of genome of Corynebacterium glutamicum

[0340] The full nucleotide sequence of the genome of Corynebacterium glutamicum was determined based on the
whole genome shotgun method (Science, 269: 496-512 (1995)). In this method, a genome library was prepared and
the terminal sequences were determined at random. Subsequently, these sequences were ligated on a computer to
cover the full genome. Specifically, the following procedure was carried out.
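
The idea of ligating randomly determined sequences on a computer can be illustrated with a toy greedy overlap merge. This is only a sketch of the principle, not the assembly software actually used in this Example (which is not named in this passage); it ignores sequencing errors, reverse-complement reads and repeats, and the function names and minimum overlap length are hypothetical.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a equal to a prefix of b (at least min_len), else 0."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest suffix/prefix overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return contigs  # no remaining overlaps of sufficient length
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


if __name__ == "__main__":
    print(greedy_assemble(["ATGCGT", "CGTACC", "ACCTTG"]))
```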

(1) Preparation of genome DNA of Corynebacterium glutamicum ATCC 13032



[0341] Corynebacterium glutamicum ATCC 13032 was cultured in BY medium (7 g/l meat extract, 10 g/l peptone, 3
g/l sodium chloride, 5 g/l yeast extract, pH 7.2) containing 1% of glycine at 30°C overnight and the cells were collected
by centrifugation. After washing with STE buffer (10.3% sucrose, 25 mmol/l Tris hydrochloride, 25 mmol/l EDTA, pH
8.0), the cells were suspended in 10 ml of STE buffer containing 10 mg/ml lysozyme, followed by gently shaking at
37°C for 1 hour. Then, 2 ml of 10% SDS was added thereto to lyse the cells, and the resultant mixture was maintained
at 65°C for 10 minutes and then cooled to room temperature. Then, 10 ml of Tris-neutralized phenol was added thereto,
followed by gently shaking at room temperature for 30 minutes and centrifugation (15,000 × g, 20 minutes, 20°C). The
aqueous layer was separated and subjected to extraction with phenol/chloroform and extraction with chloroform (twice)
in the same manner. To the aqueous layer, 3 mol/l sodium acetate solution (pH 5.2) and isopropanol were added at
1/10 times volume and twice volume, respectively, followed by gently stirring to precipitate the genome DNA. The
genome DNA was dissolved again in 3 ml of TE buffer (10 mmol/l Tris hydrochloride, 1 mmol/l EDTA, pH 8.0) containing
0.02 mg/ml of RNase and maintained at 37°C for 45 minutes. The extractions with phenol, phenol/chloroform and
chloroform were carried out successively in the same manner as the above. The genome DNA was subjected to isopropanol
precipitation. The thus formed genome DNA precipitate was washed with 70% ethanol three times, followed
by air-drying, and dissolved in 1.25 ml of TE buffer to give a genome DNA solution (concentration: 0.1 mg/ml).
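
The quantities stated above imply a total amount of genome DNA, and the volume of this solution needed for a given DNA mass follows by simple division. The sketch below uses exact fractions to avoid rounding; the function names are hypothetical, and the example mass of 0.01 mg is the amount used for the shotgun library in (2) below.

```python
from fractions import Fraction

STOCK_CONCENTRATION = Fraction("0.1")  # mg/ml, concentration stated for the genome DNA solution
STOCK_VOLUME = Fraction("1.25")        # ml of TE buffer used to dissolve the precipitate


def total_dna_mg(concentration_mg_per_ml, volume_ml):
    """Total DNA mass contained in the solution."""
    return concentration_mg_per_ml * volume_ml


def stock_volume_ml(required_mg, concentration_mg_per_ml):
    """Volume of solution that contains the required DNA mass."""
    return Fraction(required_mg) / concentration_mg_per_ml


if __name__ == "__main__":
    print(float(total_dna_mg(STOCK_CONCENTRATION, STOCK_VOLUME)), "mg in total")
    print(float(stock_volume_ml("0.01", STOCK_CONCENTRATION)), "ml for 0.01 mg")
```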

(2) Construction of a shotgun library 

[0342] TE buffer was added to 0.01 mg of the thus prepared genome DNA of Corynebacterium glutamicum ATCC
13032 to give a total volume of 0.4 ml, and the mixture was treated with a sonicator (Yamato Powersonic Model 150)
at an output of 20 continuously for 5 seconds to obtain fragments of 1 to 10 kb. The genome fragments were blunt-ended
using a DNA blunting kit (manufactured by Takara Shuzo) and then fractionated by 6% polyacrylamide gel
electrophoresis. Genome fragments of 1 to 2 kb were cut out from the gel, and 0.3 ml MG elution buffer (0.5 mol/l
ammonium acetate, 10 mmol/l magnesium acetate, 1 mmol/l EDTA, 0.1% SDS) was added thereto, followed by shaking
at 37°C overnight to elute DNA. The DNA eluate was treated with phenol/chloroform, and then precipitated with ethanol
to obtain a genome library insert. The total insert and 500 ng of pUC18 SmaI/BAP (manufactured by Amersham Pharmacia
Biotech) were ligated at 16°C for 40 hours.

[0343] The ligation product was precipitated with ethanol and dissolved in 0.01 ml of TE buffer. The ligation solution 
(0.001 ml) was introduced into 0.04 ml of E. coli ELECTRO MAX DH10B (manufactured by Life Technologies) by the
electroporation under conditions according to the manufacturer's instructions. The mixture was spread on LB plate
medium (LB medium (10 g/l bactotryptone, 5 g/l yeast extract, 10 g/l sodium chloride, pH 7.0) containing 1.6% of agar)
containing 0.1 mg/ml ampicillin, 0.1 mg/ml X-gal and 1 mmol/l isopropyl-β-D-thiogalactopyranoside (IPTG) and cultured
at 37°C overnight. 

[0344] The transformant obtained from colonies formed on the plate medium was stationarily cultured in a 96-well
titer plate having 0.05 ml of LB medium containing 0.1 mg/ml ampicillin at 37°C overnight. Then, 0.05 ml of LB medium
containing 20% glycerol was added thereto, followed by stirring to obtain a glycerol stock. 

(3) Construction of cosmid library 

[0345] About 0.1 mg of the genome DNA of Corynebacterium glutamicum ATCC 13032 was partially digested with
Sau3AI (manufactured by Takara Shuzo) and then ultracentrifuged (26,000 rpm, 18 hours, 20°C) under 10 to 40%
sucrose density gradient obtained using 10% and 40% sucrose buffers (1 mol/l NaCl, 20 mmol/l Tris hydrochloride, 5
mmol/l EDTA, 10% or 40% sucrose, pH 8.0). After the centrifugation, the solution thus separated was fractionated into
tubes at 1 ml in each tube. After confirming the DNA fragment length of each fraction by agarose gel electrophoresis,
a fraction containing a large amount of DNA fragment of about 40 kb was precipitated with ethanol.
[0346] The DNA fragment was ligated to the BamHI site of superCos1 (manufactured by Stratagene) in accordance
with the manufacturer's instructions. The ligation product was incorporated into Escherichia coli XL-1-BlueMR strain
(manufactured by Stratagene) using Gigapack III Gold Packaging Extract (manufactured by Stratagene) in accordance
with the manufacturer's instructions. The Escherichia coli was spread on LB plate medium containing 0.1 mg/ml ampicillin
and cultured therein at 37°C overnight to isolate colonies. The resulting colonies were stationarily cultured at
37°C overnight in a 96-well titer plate containing 0.05 ml of the LB medium containing 0.1 mg/ml ampicillin in each
well. LB medium containing 20% glycerol (0.05 ml) was added thereto, followed by stirring to obtain a glycerol stock.

(4) Determination of nucleotide sequence 
(4-1) Preparation of template 

[0347] The full nucleotide sequence of Corynebacterium glutamicum ATCC 13032 was determined mainly based on
the whole genome shotgun method. The template used in the whole genome shotgun method was prepared by the
PCR method using the library prepared in the above (2).

[0348] Specifically, the clone derived from the whole genome shotgun library was inoculated using a replicator
(manufactured by GENETIX) into each well of a 96-well plate containing the LB medium containing 0.1 mg/ml of ampicillin
at 0.08 ml per well and then stationarily cultured at 37°C overnight.

[0349] Next, the culturing solution was transferred using a copy plate (manufactured by Tokken) into a 96-well
reaction plate (manufactured by PE Biosystems) containing a PCR reaction solution (TaKaRa Ex Taq (manufactured by
Takara Shuzo)) at 0.08 ml per well. Then, PCR was carried out in accordance with the protocol by Makino et al.
(DNA Research, 5: 1-9 (1998)) using GeneAmp PCR System 9700 (manufactured by PE Biosystems) to amplify the
inserted fragment.

[0350] The excessive primers and nucleotides were eliminated using a kit for purifying a PCR product (manufactured
by Amersham Pharmacia Biotech) and the residue was used as the template in the sequencing reaction.
[0351] Some nucleotide sequences were determined using a double-stranded DNA plasmid as a template.






[0352] The double-stranded DNA plasmid as the template was obtained by the following method.
[0353] The clone derived from the whole genome shotgun library was inoculated into a 24- or 96-well plate containing
a 2 x YT medium (16 g/l bactotrypton, 10 g/l yeast extract, 5 g/l sodium chloride, pH 7.0) containing 0.05 mg/ml ampicillin
at 1.5 ml per well and then cultured under shaking at 37°C overnight.
[0354] The double-stranded DNA plasmid was prepared from the culturing solution using an automatic plasmid
preparing machine, KURABO PI-50 (manufactured by Kurabo Industries) or a multiscreen (manufactured by Millipore) in
accordance with the protocol provided by the manufacturer.

[0355] To purify the double-stranded DNA plasmid using the multiscreen, Biomek 2000 (manufactured by Beckman
Coulter) or the like was employed.
[0356] The thus obtained double-stranded DNA plasmid was dissolved in water to give a concentration of about 0.1
mg/ml and used as the template in sequencing.

(4-2) Sequencing reaction 

[0357] To 6 µl of a solution of ABI PRISM BigDye Terminator Cycle Sequencing Ready Reaction Kit (manufactured
by PE Biosystems), an M13 regular direction primer (M13-21) or an M13 reverse direction primer (M13REV) (DNA
Research, 5: 1-9 (1998)) and the template prepared in the above (4-1) (the PCR product or the plasmid) were added
to give 10 µl of a sequencing reaction solution. The primers and the templates were used in an amount of 1.6 pmol
and an amount of 50 to 200 ng, respectively.

[0358] Dye terminator sequencing reaction of 45 cycles was carried out with GeneAmp PCR System 9700
(manufactured by PE Biosystems) using the reaction solution. The cycle parameter was determined in accordance with the
manufacturer's instruction accompanying ABI PRISM BigDye Terminator Cycle Sequencing Ready Reaction Kit. The
sample was purified using Multiscreen HV plate (manufactured by Millipore) according to the manufacturer's
instructions. The thus purified reaction product was precipitated with ethanol, followed by drying, and then stored in the dark
at -30°C.

[0359] The dry reaction product was analyzed by ABI PRISM 377 DNA Sequencer and ABI PRISM 3700 DNA
Analyzer (both manufactured by PE Biosystems) each in accordance with the manufacturer's instructions.
[0360] The data of about 50,000 sequences in total (i.e., about 42,000 sequences obtained using 377 DNA Sequencer
and about 8,000 reactions obtained by 3700 DNA Analyzer) were transferred to a server (Alpha Server 4100:
manufactured by COMPAQ) and stored. The data of these about 50,000 sequences corresponded to 6 times as much as
the genome size.
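
The sixfold figure follows the usual coverage relation, in which the total number of sequenced bases is divided by the genome size. The following is a minimal sketch only. The genome size of about 3.3 Mb is an assumed, illustrative value that is not stated in this passage, and the function names are hypothetical.

```python
def coverage(num_reads: int, mean_read_len: float, genome_size: int) -> float:
    """Fold coverage = total sequenced bases / genome size."""
    return num_reads * mean_read_len / genome_size


def implied_read_length(num_reads: int, fold: float, genome_size: int) -> float:
    """Mean read length needed to reach `fold` coverage with `num_reads` reads."""
    return fold * genome_size / num_reads


# ASSUMPTION: a genome of about 3.3 Mb, used for illustration only.
ASSUMED_GENOME_SIZE = 3_300_000

# About 50,000 reads at 6-fold coverage imply this mean usable read length (bp).
read_len = implied_read_length(50_000, 6.0, ASSUMED_GENOME_SIZE)
```

Under that assumption, each of the roughly 50,000 reads would contribute a few hundred usable bases on average.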

(5) Assembly 

[0361] All operations were carried out on the basis of UNIX platform. The analytical data were output in Macintosh
platform using X Window System. The base call was carried out using phred (The University of Washington). The
vector sequence data was deleted using SPS Cross_Match (manufactured by Southwest Parallel Software). The
assembly was carried out using SPS phrap (manufactured by Southwest Parallel Software; a high-speed version of phrap
(The University of Washington)). The contig obtained by the assembly was analyzed using a graphical editor, consed
(The University of Washington). A series of the operations from the base call to the assembly were carried out
simultaneously using a script phredPhrap attached to consed.
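
Base callers such as phred report a quality value Q for each base, related to the estimated error probability P by Q = -10 log10(P). The sketch below illustrates that relation together with a simple end-trimming rule. The threshold of 20 and the trimming rule itself are assumptions chosen for illustration and are not the procedure used by phred, Cross_Match or phrap.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a phred quality value into an estimated error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert an estimated error probability into a phred quality value."""
    return -10 * math.log10(p)


def trim_low_quality_ends(quals, min_q=20):
    """Return (start, end) of the span that remains after removing leading
    and trailing bases whose quality is below min_q (hypothetical rule)."""
    start, end = 0, len(quals)
    while start < end and quals[start] < min_q:
        start += 1
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return start, end


span = trim_low_quality_ends([5, 10, 30, 40, 35, 8])
```

A quality value of 20 corresponds to an estimated error rate of 1 in 100, and 30 to 1 in 1,000.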

(6) Determination of nucleotide sequence in gap part 

[0362] Each cosmid in the cosmid library constructed in the above (3) was prepared by a method similar to the
preparation of the double-stranded DNA plasmid described in the above (4-1). The nucleotide sequence at the end of
the inserted fragment of the cosmid was determined by using ABI PRISM BigDye Terminator Cycle Sequencing Ready
Reaction Kit (manufactured by PE Biosystems) according to the manufacturer's instructions.

[0363] About 800 cosmid clones were sequenced at both ends to search for a nucleotide sequence in the contig derived
from the shotgun sequencing obtained in the above (5) coincident with the sequence. Thus, the linkage between
respective cosmid clones and respective contigs was determined and mutual alignment was carried out. Furthermore,
the results were compared with the physical map of Corynebacterium glutamicum ATCC 13032 (Mol. Gen. Genet.,
252: 255-265 (1996)) to carry out mapping between the cosmids and the contigs.
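
The linking of contigs through cosmid end sequences can be sketched as follows. This is an illustration under simplifying assumptions: end sequences are located by exact substring matching on either strand, whereas real data would call for an error-tolerant alignment. All names and sequences are hypothetical.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_contig(end_seq: str, contigs: dict):
    """Return the name of the first contig containing end_seq on either strand."""
    for name, seq in contigs.items():
        if end_seq in seq or revcomp(end_seq) in seq:
            return name
    return None


def link_contigs(cosmid_ends: dict, contigs: dict) -> dict:
    """Map each cosmid whose two ends fall in different contigs to that contig pair."""
    links = {}
    for cosmid, (left, right) in cosmid_ends.items():
        a, b = find_contig(left, contigs), find_contig(right, contigs)
        if a and b and a != b:
            links[cosmid] = tuple(sorted((a, b)))
    return links


# Toy data, for illustration only.
contigs = {"contig1": "ATGCCGTAAGCTTGACC", "contig2": "GGATCCTTAGCAGTCAA"}
cosmid_ends = {
    "cosA": ("CGTAAGC", "TGCTAAG"),  # ends in contig1 and contig2 (reverse strand)
    "cosB": ("ATGCC", "TTGACC"),     # both ends in contig1, so no link
}
links = link_contigs(cosmid_ends, contigs)
```

A cosmid whose two ends land in different contigs indicates that those contigs lie within one insert length of each other on the genome.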

[0364] The sequence in the region which was not covered with the contigs was determined by the following method.
[0365] Clones containing sequences positioned at the ends of contigs were selected. Among these clones, about
1,000 clones wherein only one end of the inserted fragment had been determined were selected and the sequence at
the opposite end of the inserted fragment was determined. A shotgun library clone or a cosmid clone containing the
sequences at the respective ends of the inserted fragment in two contigs was identified, the full nucleotide sequence
of the inserted fragment of this clone was determined, and thus the nucleotide sequence of the gap part was determined.
When no shotgun library clone or cosmid clone covering the gap part was available, primers complementary to the
end sequences at the two contigs were prepared and the DNA fragment in the gap part was amplified by PCR. Then,
sequencing was performed by the primer walking method using the amplified DNA fragment as a template or by the
shotgun method in which the sequence of a shotgun clone prepared from the amplified DNA fragment was determined.
Thus, the nucleotide sequence of the domain was determined.
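
The choice of primers facing into a gap can be sketched as below. This is an illustration only: it assumes both contigs are given on the same strand with the gap lying between the end of the first and the start of the second, the primer length of 20 is an arbitrary default, and no melting-temperature or specificity checks are made.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def gap_primers(left_contig: str, right_contig: str, length: int = 20):
    """Return (forward, reverse) primers that point into the gap.

    forward: the last `length` bases of the contig to the left of the gap.
    reverse: the reverse complement of the first `length` bases of the
             contig to the right of the gap.
    """
    forward = left_contig[-length:]
    reverse = revcomp(right_contig[:length])
    return forward, reverse


# Toy example with short sequences and a primer length of 4.
fwd, rev = gap_primers("AAACCCGGGTTT", "ATGCATGC", length=4)
```

The amplified product then spans the gap and can serve as a template for primer walking or for shotgun subcloning.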

[0366] In a region showing a low sequence precision, primers were synthesized using AUTOFINISH function and
NAVIGATING function of consed (The University of Washington) and the sequence was determined by the primer
walking method to improve the sequence precision. The thus determined full nucleotide sequence of the genome of
Corynebacterium glutamicum ATCC 13032 strain is shown in SEQ ID NO:1.

(7) Identification of ORF and presumption of its function 

[0367] ORFs in the nucleotide sequence represented by SEQ ID NO:1 were identified according to the following
method. First, the ORF regions were determined using software for identifying ORF, i.e., Glimmer, GeneMark and
GeneMark.hmm on UNIX platform according to the respective manual attached to the software.
[0368] Based on the data thus obtained, ORFs in the nucleotide sequence represented by SEQ ID NO:1 were
identified.
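
For orientation only, a naive six-frame ORF scan is sketched below. It is not the statistical gene-finding method implemented by Glimmer, GeneMark or GeneMark.hmm. The start codon set, the minimum length and the function names are assumptions made for the illustration.

```python
STOPS = {"TAA", "TAG", "TGA"}
STARTS = {"ATG", "GTG", "TTG"}  # ASSUMPTION: start codons accepted in this sketch


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(seq: str, min_len: int = 90):
    """Return (initial, terminal, strand) tuples in 1-based inclusive
    coordinates on the given sequence. For the reverse strand the
    initial position is larger than the terminal position."""
    orfs = []
    n = len(seq)
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, n - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon in STARTS:
                    start = i  # first start codon of a candidate ORF
                elif start is not None and codon in STOPS:
                    end = i + 3  # the stop codon is included in the ORF
                    if end - start >= min_len:
                        if strand == "+":
                            orfs.append((start + 1, end, "+"))
                        else:
                            orfs.append((n - start, n - end + 1, "-"))
                    start = None
    return orfs


toy = "CC" + "ATG" + "AAA" * 3 + "TAA" + "GG"
found = find_orfs(toy, min_len=9)
```

Scanning the reverse complement of the same toy sequence reports the ORF on the opposite strand, with the initial position larger than the terminal position.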

[0369] The putative function of an ORF was determined by searching the homology of the identified amino acid
sequence of the ORF against an amino acid database consisting of protein-encoding domains derived from
Swiss-Prot, PIR or Genpept database constituted by protein encoding domains derived from GenBank database, using
Frame Search (manufactured by Compugen) or BLAST. The nucleotide sequences
of the thus determined ORFs are shown in SEQ ID NOS:2 to 3501, and the amino acid sequences encoded by these
ORFs are shown in SEQ ID NOS:3502 to 7001.

[0370] In some cases of the sequence listings in the present invention, nucleotide sequences, such as TTG, TGT, 
GGT, and the like, other than ATG, are read as an initiating codon encoding Met. 

[0371] Also, the preferred nucleotide sequences are SEQ ID NOS:2 to 355 and 357 to 3501, and the preferred amino
acid sequences are shown in SEQ ID NOS:3502 to 3855 and 3857 to 7001.
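
The nucleotide range (SEQ ID NOS:2 to 3501) and the amino acid range (SEQ ID NOS:3502 to 7001) each contain 3500 entries, so the two numberings appear to be offset by 3500. The helper below sketches that bookkeeping. The constant offset is an inference from the ranges given above, and the function name is hypothetical.

```python
NUM_ORFS = 3500      # entries in each of the two SEQ ID ranges
FIRST_DNA_ID = 2     # first nucleotide SEQ ID NO of an ORF


def aa_seq_id(dna_seq_id: int) -> int:
    """Return the amino acid SEQ ID NO paired with a nucleotide SEQ ID NO,
    assuming a constant offset of 3500 between the two ranges."""
    last_dna_id = FIRST_DNA_ID + NUM_ORFS - 1
    if not FIRST_DNA_ID <= dna_seq_id <= last_dna_id:
        raise ValueError(f"nucleotide SEQ ID NO must be in {FIRST_DNA_ID}..{last_dna_id}")
    return dna_seq_id + NUM_ORFS
```

Under this assumption, the excluded nucleotide sequence SEQ ID NO:356 pairs with the excluded amino acid sequence SEQ ID NO:3856.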

[0372] Table 1 shows the registration numbers in the above-described databases of sequences which were judged
as having the highest homology with the nucleotide sequences of the ORFs as the results of the homology search in
the amino acid sequences using the homology-searching software Frame Search (manufactured by Compugen),
names of the genes of these sequences, the functions of the genes, and the matched length, identities and analogies
compared with publicly known amino acid translation sequences. Moreover, the corresponding positions were
confirmed via the alignment of the nucleotide sequence of an arbitrary ORF with the nucleotide sequence of SEQ ID NO:1.
Also, the positions of nucleotide sequences other than the ORFs (for example, ribosomal RNA genes, transfer RNA
genes, IS sequences, and the like) on the genome were determined.
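
Table 1 lists, for each ORF, an initial position (nt), a terminal position (nt) and an ORF length (bp). The sketch below shows how these columns relate, assuming inclusive coordinates and assuming that an initial position larger than the terminal position denotes an ORF on the complementary strand. Both are interpretations of the table, not statements made in the text, and the class name is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrfPosition:
    initial: int   # "Initial (nt)" column of Table 1
    terminal: int  # "Terminal (nt)" column of Table 1

    @property
    def length_bp(self) -> int:
        """ORF length in bp, counting both end positions."""
        return abs(self.terminal - self.initial) + 1

    @property
    def strand(self) -> str:
        """'+' if the ORF runs with increasing coordinates, otherwise '-'."""
        return "+" if self.terminal >= self.initial else "-"


# Two position pairs taken from Table 1.
orf_a = OrfPosition(initial=2257024, terminal=2255738)
orf_b = OrfPosition(initial=2261688, terminal=2262689)
```

For these two rows the computed lengths, 1287 bp and 1002 bp, agree with the ORF (bp) values listed in the table.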

[0373] Fig. 1 shows the positions of typical genes of the Corynebacterium glutamicum ATCC 13032 on the genome.

Function


hypothetical membrane protein 


2,5-diketo-D-gluconic acid reductase


5'-nucleotidase precursor


Table 1 (continued)


Homologous gene 


Mycobacterium leprae 
MLCB1788.18 


Corynebacterium sp. ATCC 
31090 


Vibrio parahaemolyticus nutA 


Deinococcus radiodurans


Streptococcus pyogenes SF370


Function


hypothetical protein 


extragenic suppressor protein 


polyphosphate glucokinase 


sigma factor or RNA polymerase 
transcription factor 


hypothetical membrane protein 




hypothetical protein 


hypothetical membrane protein 


hypothetical protein 


transferase 


hypothetical protein 


iron dependent repressor or 
diphtheria toxin repressor 


putative sporulation protein 


UDP-glucose 4-epimerase 




hypothetical protein 


ATP-dependent RNA helicase


length


Similarity


68.2


51.4


98.7


62.0 


99.1 




45.3 


24.4 


Table 1 (continued) 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv2699c 


Escherichia coli K12 suhB


Mycobacterium tuberculosis 
H37Rv Rv2702 ppgK


Corynebacterium glutamicum 
sigA 


Bacillus subtilis yrkO 




Mycobacterium tuberculosis 
H37Rv Rv2917


Mycobacterium tuberculosis 
H37Rv Rv2709


Mycobacterium tuberculosis 
H37Rv Rv2708c 


Streptomyces coelicolor A3(2) 
SCHS.OBc 


Corynebacterium glutamicum 
ATCC 13869 ORF 1 


Corynebacterium glutamicum 
ATCC 13869 dtxR


Streptomyces aureofaciens 


Corynebacterium glutamicum 
ATCC 13869 (Brevibacterium
lactofermentum) galE 




Mycobacterium tuberculosis 
H37Rv Rv2714 


Saccharomyces cerevisiae 
YJL050W dob1


2016257


2018754


2023948


2026379 


2029043


Initial
(nt) 


2009570 


2010539 


2010555 


2011863 


2015496 


2016121 


2017966 


2018119 


2018202 


2018744 


2020293 


2022266


2022959


2025270


2026494


Function


hydrogen peroxide-inducible genes 
activator 




ATP-dependent helicase 


regulatory protein 




SOS regulatory protein 


galactitol utilization operon repressor 


phosphofructokinase (fructose 1- 
phosphate kinase) 


phosphoenolpyruvate-protein 
phosphotransferase 


glycerol-3-phosphate regulon 
repressor 


1-phosphofructokinase or 6-
phosphofructokinase


PTS system, fructose-specific IIBC


phosphocarrier protein




uracil permease 


ATP/GTP-binding protein






diaminopimelate epimerase


Homologous gene


Streptomyces clavuligerus nrdR




Bacillus subtilis dinR 


Escherichia coli K12 gatR 


Streptomyces coelicolor A3(2) 
SCE22.14C 


Bacillus stearothermophilus ptsI


Escherichia coli K12 glpR 


Rhodobacter capsulatus fruK 


Escherichia coli K12 fruA 


Bacillus stearothermophilus XL-


Bacillus caldolyticus pyrP


2038591


2039550 


2039618 


2042519 


2043508 


2045571 


2046028 


2046714 


2047320 


2048650 


2051106 


2051842 


2051845


(nt)


2029177 


2031365 


2031478 


2035880 


2036409 


2036812 


2037815 


2038591 


2041321


2043736


2045762 


2047295


Function


tRNA delta-2-isopentenylpyrophosphate
transferase




hypothetical protein 






hypothetical membrane protein 


hypothetical protein 


glutamate transport ATP-binding 
protein 


Neisserial polypeptides predicted to
be useful antigens for vaccines and
diagnostics


glutamate transport system 
permease protein 


glutamate transport system 
permease protein 


regulatory protein 


hypothetical protein 




biotin synthase 


putrescine transport ATP-binding 
protein 


hypothetical membrane protein


33.0


Homologous gene


Escherichia coli K12 miaA 




Mycobacterium tuberculosis 
H37Rv Rv2731 






Mycobacterium tuberculosis 
H37Rv Rv2732c 


Mycobacterium leprae 
B2235.C2J95 


Corynebacterium glutamicum 
ATCC 13032 gluA 


Neisseria gonorrhoeae 


Corynebacterium glutamicum 
ATCC 13032 gluC 


Corynebacterium glutamicum 
(Brevibacterium flavum) ATCC 
13032 gluD 


Mycobacterium leprae recX 


Mycobacterium tuberculosis 
H37Rv Rv2738c 




Bacillus sphaericus bioY 


Escherichia coli K12 potG


Bacillus subtilis ybaF


2065394


2065667 


2067141


hypothetical protein


competence damage induced
proteins


Homologous gene


Mycobacterium tuberculosis 


Mycobacterium tuberculosis 
H37Rv Rv2744c


Mycobacterium tuberculosis 
H37Rv Rv2745c 


Streptococcus pneumoniae R6X 
cinA 


Streptococcus pyogenes pgsA 


Arabidopsis thaliana
ATSP:T16I18.20 


Streptococcus pneumoniae 
DBL5 pspA 




Escherichia coli terC


Bacillus subtilis 168 spoIIIE


Streptomyces coelicolor A3(2)


Function


ABC transporter 




hypothetical protein (gcpE protein) 




hypothetical membrane protein 


polypeptides can be used as 
vaccines against Chlamydia 
trachomatis 


1-deoxy-D-xylulose-5-phosphate 
reductoisomerase 








ABC transporter ATP-binding protein 


pyruvate formate-lyase 1 activating 
enzyme 


hypothetical membrane protein 


phosphatidate cytidylyltransferase 


ribosome recycling factor 


uridylate kinase 




elongation factor Ts 


30S ribosomal protein S2




Matched
length
(a.a)

Similarity 


71.1 




73.8 




73.6 


43.0 


42.0 








75.1 


78.0 


74.5 


56.5 


84.3 


43.1




76.8 


83.5 


Identity 
(%) 


37.3 




44.3 






36.0 


22.8 








37.1 




41.5 


33.3 


47.0 


28.4 




49.6 




Homologous gene 


Bacillus subtilis 168 yvrO 




Escherichia coli K12 gcpE 




Mycobacterium tuberculosis 
H37Rv Rv2869c 


Chlamydia trachomatis 


Escherichia coli K12 dxr 








Thermotoga maritima MSB8 
TM0793 


Mycobacterium tuberculosis 
H37Rv 


Mycobacterium tuberculosis 
H37Rv Rv3760 


Pseudomonas aeruginosa 
ATCC 15692 cdsA 


Bacillus subtilis 168 frr 


Pseudomonas aeruginosa pyrH 




Streptomyces coelicolor A3(2)
SC2E1.42 tsf 


Bacillus subtilis rpsB


Terminal
(nt) 


2126753 


2126926 


2127350 


2129461 


2128669 


2130950 


2129903


2131762 


2131247 


2131825 


2133406 




2136141 


2136235 


2137286 


2137936 


2139854 


2139003 


2140071 




Initial 
(nt) 


2126064 


2127087 


2128463 


2128850 


2129880 


2130306 


2131078 


2131322 


2131726




2134260 


2135551 


2135884 


2137089 


2137840 


2138664 


2138994 


2139827 


2140886


2215


2220


2221


Function


hypothetical protein 


site-specific recombinase


hypothetical protein 


Mg(2+) chelatase family protein 


hypothetical protein 


hypothetical protein 


ribonuclease HII




signal peptidase 


Fe-regulated protein 




50S ribosomal protein L19


thiamine phosphate 
pyrophosphorylase 


oxidoreductase 


thiamine biosynthetic enzyme thiS 
(thiG1) protein 


thiamine biosynthetic enzyme thiG 
protein 


molybdopterin biosynthesis protein


length


56.8


28.4


34.0


48.2


30.2


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv2891 


Proteus mirabilis xerD 


Mycobacterium tuberculosis 
H37Rv Rv2896c 


Mycobacterium tuberculosis 
H37Rv Rv2897c 


Mycobacterium tuberculosis 
H37Rv Rv2898c 


Mycobacterium tuberculosis 
H37Rv Rv2901c


Haemophilus influenzae Rd 
HI1059 rnhB


Staphylococcus aureus sirA




Bacillus stearothermophilus rplS


Bacillus subtilis 168 thiE 


Streptomyces coelicolor A3(2) 
SC6E10.01 


Escherichia coli K12 thiS 


Escherichia coli K12 thiG


Emericella nidulans cnxF


2153113


Initial


2141257


2145586


transcriptional accessory protein


sporulation-specific degradation
regulator protein 


dicarboxylase translocator 


2-oxoglutarate/malate translocator


3-carboxy-cis,cis-muconate
cycloisomerase 




tRNA(guanine-N1)-
methyltransferase


hypothetical protein 


16S rRNA processing protein 


hypothetical protein 


30S ribosomal protein S16


inversin


ABC transporter


signal recognition particle protein


cell division protein


length


Homologous gene


Function






glucan 1,4-alpha-glucosidase or 
glucoamylase S1/S2 precursor 




chromosome segregation protein 


acylphosphatase 




transcriptional regulator 


hypothetical membrane protein 








cation efflux system protein 


formamidopyrimidine-DNA 
glycosylase 


ribonuclease III 


hypothetical protein 


hypothetical protein 


transport protein 


ABC transporter 


hypothetical protein


76.6


66.7 


76.5 


62.5 


76.9 


55.6 


58.8 


62.6 




Identity 
(%) 






22.4 




48.3 


51.1 




23.9 


39.3 






46.8 


36.1 


40.3 


35.8


26.3


26.6 


35.3 




Homologous gene 






Saccharomyces cerevisiae 
S288C YIR019C sta1




Mycobacterium tuberculosis 
H37Rv Rv2922c smc 


Mycobacterium tuberculosis 
H37Rv Rv2922.1c




Escherichia coli K12 yfeR


Mycobacterium leprae 
MLCL581.28c






Dichelobacter nodosus gep 


Escherichia coli K12 mutM or
fpg


Bacillus subtilis 168 rncS 


Mycobacterium tuberculosis 
H37Rv Rv2926c


2187233


2190540


2193165


2268


2271


2274
2275


Function


hypothetical protein 





maltodextrin phosphorylase / 
glycogen phosphorylase 

hypothetical protein


prolipoprotein diacylglyceryl 
transferase 


indole-3-glycerol-phosphate 
synthase / anthranilate synthase 
component II 




hypothetical membrane protein 


phosphoribosyl-AMP cyclohydrolase 


cyclase 


inositol monophosphate 
phosphatase 


phosphoribosylformimino-5-
aminoimidazole carboxamide
ribotide isomerase 


glutamine amidotransferase 


chloramphenicol resistance protein 
or transmembrane transport protein 




Matched
length
(aa)


Similarity 
(%) 


43.7 




67.4 


65.5 


62.1 


58.8 


79.8 


97.7 


94.0 


97.6 


92.4 


54.0 


Identity 
(%)


21.0


32.9


36.1


29.4


Emericella nidulans trpC


Rhodobacter sphaeroides ATCC


Corynebacterium glutamicum
AS019 hisF 


Corynebacterium glutamicum


Function




imidazoleglycerol-phosphate 
dehydratase 


histidinol-phosphate 
aminotransferase 


histidinol dehydrogenase 


serine-rich secreted protein 






histidine secretory acid phosphatase 


tet repressor protein 


glycogen debranching enzyme


hypothetical protein 


oxidoreductase 


myo-inositol 2-dehydrogenase 


galactitol utilization operon repressor 


ferrichrome transport ATP-binding 
protein or ferrichrome ABC 
transporter 


hemin permease 


iron-binding protein 


iron-binding protein 


hypothetical protein


hisC


ATCC 607 hisD


Schizosaccharomyces pombe 
SPBC215.13 






Leishmania donovani SAcP-1 


Escherichia coli plasmid RP1


Sinorhizobium meliloti idhA


Bacillus subtilis 168 fhuC


Function


DNA polymerase III epsilon chain




maltooligosyl trehalose synthase 


hypothetical protein 










alkanal monooxygenase alpha chain 


hypothetical protein 






maltooligosyltrehalose 
trehalohydrolase 


hypothetical protein


threonine dehydratase 






Corynebacterium glutamicum AS019 


DNA polymerase III 


chloramphenicol sensitive protein 


histidine-binding protein precursor 


hypothetical membrane protein


length


Similarity


50.1




68.6


72.4 


99.3 






49.6


36.5


22.7


Table 1 (continued)


Homologous gene 


Streptomyces coelicolor A3(2) 
SCI8.12 




Arthrobacter sp. Q36 treY 


Deinococcus radiodurans 
DR1631 










Photorhabdus luminescens 
ATCC 29999 luxA 


Streptomyces coelicolor A3(2) 
SC7H2.05 






Arthrobacter sp. Q36 treZ 


Bacillus subtilis 168 


Corynebacterium glutamicum 
ATCC 13032 ilvA 






Catharanthus roseus metE 


Streptomyces coelicolor A3(2) 
dnaE 


Escherichia coli K12 rarD


Campylobacter jejuni DZ72 hisJ 


Archaeoglobus fulgidus AF2388


Function


short chain dehydrogenase or 
general stress protein 


diaminopimelate (DAP) 
decarboxylase 


cysteine synthase 




ribosomal large subunit
pseudouridine synthase D 


lipoprotein signal peptidase 




oleandomycin resistance protein 




hypothetical protein 


L-asparaginase 1 


DNA-damage-inducible protein P 


hypothetical membrane protein 


transcriptional regulator 




hypothetical protein 


isoleucyl-tRNA synthetase


Homologous gene


Bacillus subtilis 168 ydaD 


Pseudomonas aeruginosa lysA 


Alcaligenes eutrophus CH34 
cysM 




Escherichia coli K12 rluD


Pseudomonas fluorescens NCIB 
10586 lspA




Streptomyces antibioticus oleB 




Rhodococcus erythropolis orf17 


Bacillus licheniformis 


Escherichia coli K12 dinP


Escherichia coli K12 ybiF 


Streptomyces coelicolor A3(2) 
SCF51.06 




Streptomyces coelicolor A3(2)


Saccharomyces cerevisiae


db Match


sp:GS39_BACSU


sp:DCDA_PSEAE


pir:S67863


prf:2422382P


sp:DINP_ECOLI


gp:SCF51_6


sp:SYIC_YEAST








SEQ NO. (DNA)   SEQ NO. (a.a.)   Initial (nt)   Terminal (nt)   ORF (bp)
2336            5836             2255558        2254683         876
2337            5837             2257024        2255738         1287
2338            5838             2259312        2258362         951
2339            5839             2259999        2259421         579
2340            5840             2260931        2260002         930
2341            5841             2261467        2260934         534
2342            5842             2261688        2262689         1002
2343            5843             2262850        2264499         1650
2344            5844             2264996        2265298         303
2345            5845             2265108        2264509         600
2346            5846             2265420        2266394         975
2347            5847             2268297        2266897         1401
2348            5848             2269245        2268388         858
2349            5849             2270261        2269260         1002
2350            5850             2270304        2270435         132
2351            5851             2270884        2270258         627
2352            5852             2274149        2270988         3162
2353            5853             2274688        2274473         216
2354            5854             2275861        2274767         1095



EP1 108 790 A2 



e 
o 

O- 



X) 

E 
E 



o 

Q. 



0) 

o 

« 

£Z C 

o 2 



o 



O 
k_ 
Q. 
C 

.2 
> 



o 
c 
a> 
o 
a. 
c 
g 

ra 

^ ai 

il 
|.i 

O TJ 



n w ) 
23 .9>?> 



2 * 2 



E o-S 

<0 (0 m 

S c -o 

>^ c o 
o £ ex 



i C- O — 

zero" 

• >> CL - 

n <5 2 

y O >s 

3 CO CL 



O CO 



S 2 
3| 
8 5 

f s 



(D 
C 

c 

is 

o 

E ' 

E w 

-=r CO 

2r co 

a> — 

" 01 

eg *- 

41 

q. co 

D § 

Z) co 







O 




E 




CO 




k_ 

3 




E 




>* 




"5 




CJ 
CD 
1 


a 


-g 


c 




pho- 


apep 


in 




O 


e 


x: 


CD 


cu 


CL 



9 ° 

O CL 

co E co 
3 E.2 1 
c «Z 
£ x> 

O CM CO 
CO _L < 

q-I ? 

D 3 5 
3 cn co 



CO 0) — ' 



CD 
CM 



CN 
T 



CN 
CN 
CM 



CO 
00 



CN 

co 



LO 

to 



CO 



o 
o 
o 



CO 
CD 



CM 
CO 



CO 
CO 

CO 



CO 
CO 
CO 



T3 
O 
=3 
C 

o 
o 



CD 
C 
0) 
CO 



0) 
X) 

it 

x> > 

o cr 

1*2 



E 

3 

a> 

"G 

CO 

-9 

«* life 



E c 



CO 

c 



in 

CO 
CM 



a 

m 

CM 

o 

CM 
CM 

ri 

CL 



CO 

in 



CO 
CO 



CM 
CM 



CO 

r- 

CN 
CN 



CD 

co 
CO 

r- 

CM 
CM 



CO 

£ 



a 

CO 
-Q 
0J 
C 

3 s 



511 



s 

LO 

o 

CL 



CO 
CN 
O 

cn 

o 
CD 
< 

CL 



CO 
CO 
CO 



CO 
CO 



CM 
CM 



O 
CO 
CO 



CM 
CM 



CM 
O 
CO 



CM 
CM 



CO 
CM 



CO 
LO 



CN 
ID 
CM 
CO 



E 

™ 5 

E E 

3 CO 

= CD 

0J CO 

"o CO 

(0 

5 a 

CD < 



o 

o 

-2 § 

~ «/i 

E ^ 

3 Oi 

C CO 

<U CO 

"XS co 

CO 

r§ O 

»— r— 

CD < 



o 

o Q 

iS 3 

E E 

3 CO 

™ CD 

0) CO 

"o CO 

CO 

fi o 

CD < 



CO 

i— 

■ E 

CM 



CO 

8 

CM 

5 



CM 



CD 



CD 

cL 



O 

o 

UJ 

t 

< 
or 



o 
vn 

CD 



CD 
CO 
CM 
CO 
CM 
CM 



CM 
CO 



CO 
CM 
CM 



o 



CM 
CD 
CO 
CD 



4: 

3 

E 

CM 

u 

CO 

o 
a3 
u 
LLi 



O 
O 

LU 
LL 

a 



cn 

CD 



CO 
LO 
CO 

r— 

CO 
CM 
CM 



LO 
CO 
CO 
CM 
CM 



2§ 5 

to 2 «. 



o n < 
lu y z 

CO z o 



CO 
LO 



LO 
LO 



CO 
LO 



r^- 1 co 

LO I LO 

CO CO 

LO LO 



o 

CO 
CO 
LO 



CM 
CO 



CO 
CO 



CO 
LO 



CO ; O 

LO CO 

CO 1 CO 

CM CM 



CM 
CO 
CO 



CO 
CD 



CD 



LO 
CD 
CO 
CM 



CO 
CO I CD 

co ! co 
lo ; uo 



co 
co 
co 



CD j 

CO 

CN 



CD 
CD 
CO 
CM 



CO 
CO 
CO 
CM 





Function 
penicillin-binding protein


penicillin-binding protein 




hypothetical protein 


hypothetical membrane protein 


hypothetical protein 




hypothetical protein 


5,10-methylenetetrahydrofolate 
reductase 


dimethylallyltranstransferase 


hypothetical membrane protein 




hypothetical protein 


eukaryotic-type protein kinase




hypothetical membrane protein 
length
Similarity
68.8
Identity
28.2
42.6
34.2
Table 1 (continued)


Homologous gene 


Bacillus subtilis 168 murE 


Brevibacterium lactofermentum 
ORF2 pbp 


Pseudomonas aeruginosa pbpB 




Mycobacterium tuberculosis 
H37Rv Rv2165c 


Mycobacterium leprae 
MLCB268.11c


Mycobacterium tuberculosis 
H37Rv Rv2169c 




Mycobacterium leprae 
MLCB268.13 


Streptomyces lividans 1326 
metF 


Myxococcus xanthus DK1050 
ORF1 


Mycobacterium leprae 
MLCB268.17 




Mycobacterium tuberculosis 
H37Rv Rv2175c 


Streptomyces coelicolor A3(2)
pkaF 




Mycobacterium leprae 
MLCB268.23 
LU
2295376
2298438
34.2




44.4 


41.2 




59.1 


49.0 






59.1






54.6 


30.4 


Table 1 (continued)


Homologous gene 


Streptomyces coelicolor A3(2) 
SC4A7.08 












Bacillus subtilis 168 phoD




Streptomyces coelicolor A3(2) 
SCI51.17 


Mycobacterium tuberculosis 
H37Rv Rv2342




Mycobacterium smegmatis 
dnaG 


Streptomyces aureofaciens BMK 






Mycobacterium smegmatis 
mc2155 glmS 






Mycobacterium smegmatis dgt 


Neisseria meningitidis NMA0251 


db Match 


gp:SC4A7_8 












<o 
o 
< 

CO 

Q 1 
CD 
CL 
Q. 

CL 
U) 




gp:SCI51_17


pir:G70661 




prf:2413330B


"~i 

r*- 

co 

TT 

CO 

ro 

a 

CL 

cn 






CO 
CO 

to 
o 

UL 
< 
CL 
CO 






prf:2413330A 


gp:NMA1Z2491_23






CO 
CM 
CO 


CM 
CO 




CO 
LO 


CO 
CO 


CM 
CO 


1560


r- 


1836


o 

CM 


to 
r- 

CO 


1899 


CM 
CD 


CO 
CM 


CD 
CO 
CD 


1869 


CM 
CO 


1152 


1272 


m 
to 




Terminal 
(nt) 


2391184 


2392075 


2392579 


2393970 


2393973 


2394935


2396763 


2395273


2399099 


2399397 


2399668 


2399405 


2401834 


2402080


2402530 


2402144 


2404846 


2406822 


2404987 


2406262 




Initial 
(nt) 


2392008 


2392566


2393349 


2393425 


2394437 


2394594 


2395204 


2395986 


2397264 


2399158 


2400342 


2401303 


2401373


2401838 


2403165 


2404012 


2404523 


2405671 


2406258 


2406936 




SEQ
NO.
(a.a.)


5975 


5976 


5977 


5978 


5979 


5980 


5981 


5982 


5983 


5984 


5985 


5986


5987 
5988 


5989 


5990 


5991 


5992 


5993 


5994




SEQ 
NO. 

(DNA)


2475 


2476


2477 


2478 


2479


2480


2481 


2482 


2483 


2484 


2485 
2486 


2487 


2488


2489


2490


2491 


2492


2493 


2494 


Function 


hypothetical protein 


hypothetical protein 




glycyl-tRNA synthetase


bacterial regulatory protein, arsR 
family 


ferric uptake regulation protein 


hypothetical protein (conserved in 
C.glutamicum?) 


hypothetical membrane protein 


undecaprenyl diphosphate synthase 


hypothetical protein 


Era-like GTP-binding protein 


hypothetical membrane protein 


hypothetical protein 


Neisserial polypeptides predicted to 
be useful antigens for vaccines and 
diagnostics 


phosphate starvation inducible 
protein 


hypothetical protein 






Matched 
length 
(a.a) 


CM 
OJ 
CO 


CO 

ro 




CO 

o 
uo 


cn 

CO 


CM 

CO 


CO 
CM 
LO 


CM 
CM 


CO 
CO 
CM 


m 

CM 


CO 
Ol 
CM 


CM 
CO 


m 


in 

GO 


CO 


GO 
CM 






>s 






































Similarity
(%) 


63.6 


54.4 




69.9 


73.0 


70.5 


46.7 


67.0 


71.2 


74.3 


70.3 


CM 
CO 


86.0 


50.0 


84.6 


75.4 






Identity 
(%) 


31.1 


CO 
CM 




46.1 


49.4 


CO 

*r 

CO 


24.8 


40.6 


43.4 


ih 


39.5 


52.8


65.0


45.0 


61.1 


44.0 




Table 1 (continued)


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv2345 


Drosophila melanogaster 
CG10592 




Thermus aquaticus HB8 


Mycobacterium tuberculosis 
H37Rv Rv2358 furB


Escherichia coli K12 fur


Mycobacterium tuberculosis 
H37Rv Rv1128c 


Streptomyces coelicolor A3(2) 
h3u 


Micrococcus luteus B-P 26 uppS 


Mycobacterium tuberculosis 
H37Rv Rv2362c


Streptococcus pneumoniae era 


Mycobacterium tuberculosis 
H37Rv Rv2366 


Mycobacterium tuberculosis 
H37Rv Rv2367c 


Neisseria meningitidis 


Mycobacterium tuberculosis 
H37Rv Rv2368c phoH 


Streptomyces coelicolor A3(2)
SCC77.19C 




db Match 


CM 

s 

o 
co 

LJ 
CL 


gp:AE003565_26 




pir:S58522 


pir:E70585 


_j 
O 
o 

tu . 

a: 

Li- 
Cc 
VI 


cn 

CO 

m 
O 

< 

LJ 
CL 


gp:AF162938_1 


=> 
—i 
o 

a. 

CL 

CL 
CO 


CO 
CO 

in 
o 
r-. 
< 

"o_ 


gp:AF072811_1


=> 
. t- 
o 
> 

uj 1 
o 

> 

CL 
co 


sp:YN67_MYCTU 


GSP:Y75650 


3 
h- 
O 
> 

J 

o 

X 
Q_ 

CL 

CO 


Ol 
r— 

r^' 

o 

O 
CO 
cL 
cn 






si 


2037 


CO 
CO 

^- 


CM 
CO 
lO 


1383 


o> 

CO 
CO 


CM 
CO 

-<r 


1551 


CM 
CO 

r» 


Oi 
CM 
r-^ 


CO 
CN 


tn 

O) 


1320 


CO 
CO 

in 


<o 

CM 


1050 


CO 

CM 


CM 
CO 




Terminal 
(nt) 


2409029 


2409779 


2410280 


2410956 


2412948 


2413423 


2415118 


2415298 


2416371 


2417222 


2417969 


2418990 


2420313 


2421236 


2420900 


2421975 


2423791 




Initial 
(nt) 


2406993 


2410264 


2410861 


2412338 


2412580 


2412992 


2413568 


2416089 


2417099 


1 

2417947 


2418883


2420309 


2420900 


2420973 


2421949 


2422697 


o 
m 

CO 
CM 
CM 

CM 




SEQ 

NO. 
(a.a.) 


5995 


5996 


5997 


5998 


5999


6000 


6001 


6002 


6003 


6004 


6005 


6006 


6007 


6008 


6009 


6010 


6011 




SEQ 

NO. 

(DNA) 


2495 


2496 


2497 


2498


2499 


2500 


2501 


2502 


2503 


2504 


2505


2506 


2507 


2508 


2509 


2510 


2511 






C 

c 

8 



Function 


heat shock protein dnaJ 


heat-inducible transcriptional 
repressor (groEL repressor) 


oxygen-independent 
coproporphyrinogen III oxidase 


agglutinin attachment subunit 
precursor 






long-chain-fatty-acid-CoA ligase 


CD 
</> 
CO 

c 
ra 

o 

c 

CO 

a 

7> 
CO 
-C 
CL 
CO 


ABC transporter, Hop-Resistance 
protein 


Neisserial polypeptides predicted to
be useful antigens for vaccines and 
diagnostics 


polypeptides predicted to be useful 
antigens for vaccines and 
diagnostics 






peptidyl-dipeptidase 


carboxylesterase 


glycosyl hydrolase or trehalose 
synthase 


hypothetical protein


Matched 
length 
(a.a) 


o 

CO 
CO 


CO 
CO 


o 

CM 
CO 


CO 

^* 






to 


GO 
CO 

r- 


S 
to 


CO 

to 


o 






a 
cn 
to 


CO 
LO 

n- 


cn 

LO 


CO 


Similarity 
(%) 


77.4 


79.6 


64.1 


64.9 






75.1 


55.4 


64.4 


51.0 


53.0 






68.3


45.7 


84.9 


58.8 


Identity 
(%) 


47.1 


48.2 


33.1 


36.6 






o 

CO 


26.3 


29.5 


o 


47.0 






CO 

o 


24.1 


65.2 


32.1


Homologous gene 


Streptomyces albus dnaJ2 


Streptomyces albus hrcA 


Bacillus stearothermophilus 
hemN 


Saccharomyces cerevisiae 
YNR044W AGA1 






Streptomyces coelicolor A3(2)
SC6G10.04 


Escherichia coli K12 malQ 


Lactobacillus brevis plasmid 
horA 


Neisseria gonorrhoeae 


Neisseria meningitidis 






Salmonella typhimurium dcp 


Anisopteromalus calandrae


Mycobacterium tuberculosis 
H37Rv Rv0126


Mycobacterium tuberculosis 
H37Rv Rv0127


db Match 


prf:2421342B


< 

CM 

CO 

CM 
TT 
CM 

i= 
o. 


prf:2318256A


co 
< 

Ui 

< 
< 

CL 
t/1 






5 

CO 

a 

CO 
CL 

cn 


sp:MALQ_ECOLI 


rM 1 

LO 

in 
o 
o 
GQ 
< 

CL 

cn 


r-. 

CM 
CO 
T 

> 
CL 

CO 

CD 


GSP:Y74829






>- 
i — 

i 

< 

cn 

0.' 

U 

o 

ioL 
%n 


gp:AF064523_1


pir:G70983


pir:H70983


ORF 
(bp) 


1146 


1023 


o 
cn 
cn 


cn 

LO 


CO 

cn 

CO 


CO 
N- 
CO 


1845 


2118 


1863 


LO 
LO 
CM 


CO 
CO 
CO 


o 

CO 


TT 

O 

CM 


2034 


1179 


1794 


1089 


Terminal 
(nt) 


2422700 


2423915 


2424965 


2426699 


2426776 


2427807 


2428184 


2432413 


2434370 


2433614 


2433875 


2434440 


2434573 


2434805 


2438049


2439906 


2440994 


Initial 
(nt) 


2423845 


2424937 


2425954 


2426181 


2427468 


2428184 


2430028 


2430296 


2432508 


2433868 


2434207 


2434619 


2434776 


2436838 


2436871 


2438113 


2439906 


SEQ 

NO. 
(a.a.) 


6012 


6013 


6014 


6015 


6016


6017 


6018 


6019


6020 


6021 


6022 


6023 


6024 


6025 


6026 


6027 


6028 


SEQ 
NO. 
(DNA) 


2512 
2513 


2514 


2515 


2516 


2517 


2518 


2519 


2520 


2521 


2522 


2523 


2524 


2525 


2526


2527 


2528 


Function 


isopentenyl-diphosphate Delta-
isomerase












beta C-S lyase (degradation of 
aminoethylcysteine) 


branched-chain amino acid transport
system carrier protein (isoleucine 
uptake) 


alkanal monooxygenase alpha chain 




malonate transporter 


glycolate oxidase subunit 


transcriptional regulator 




hypothetical protein 




heme-binding protein A precursor 
(hemin-binding lipoprotein) 


oligopeptide ABC transporter 
(permease) 


dipeptide transport system 
permease protein 


oligopeptide transport ATP-binding 
protein 




Matched 
length 
(aa) 


O) 
CO 












LO 

CM 
CO 


CO 
CM 


CO 
CO 




xr 

CM 
CO 


CO 
CO 
XT 


CO 

0 

CM 




CO 




CO 

m 


tn 

CO 


CM 


CM 

r- 
co 
















































Similarity
(%) 


57.7 












100.0 


100.0 


49.0 




LO 
O 
CO 


55.1 


65.0 




57.6 




55.5 


73.3 


74.5 


66.4 




Identity 
(%) 


31.8 












99.4 


99.8


21.6 




25.9 


27.7 


25.6 




in 

CM 
CM 




27.5 


40.0 


43.2 


37.4 


Table 1 (continued)


Homologous gene 


Chlamydomonas reinhardtii ipi1












Corynebacterium glutamicum 
ATCC 13032 aecD 


Corynebacterium glutamicum 
ATCC 13032 brnQ


Vibrio harveyi luxA 




Sinorhizobium meliloti mdcF 


Escherichia coli K12 glcD 


Escherichia coli K12 ydfH




Salmonella typhimurium ygiK 




Haemophilus influenzae Rd 
HI0853 hbpA


Bacillus subtilis 168 appB 


Escherichia coli K12 dppC


Escherichia coli K12 oppD 




db Match 


pir:T07979 












gp:CORCSLYS_1 


sp:BRNQ_CORGL 


< 

X 
CD 
> 

3' 

Z) 
_J 

CL 
tn 




CM 

LO 
LO 

LL 
< 

CJ) 


sp:GLCD_ECOLI 


_j 
O 
a 

LU 

x' 

LL. 

a 

>- 

CL 
cn 




r- 
—I 
< 
CO 

> 

id. 
tn 




2 

LU 
< 
X 

£' 

CD 
X 

CL 
to 


sp:APPB_BACSU 


sp:DPPC_ECOLI 


prf:2306258MR 




si 


LO 
CO 
If) 


CM 
CM 
CM 


CO 
ro 


1755 


o 

CO 
CO 


LO 


LO 
r- 
O 


1278 


CO 

o> 


CM 

to 


r- 
CM 
CO 


2844 


r- 


CM 
CO 
CM 


1347 


ro 

CM 
^* 


1509 


CO 
CO 

cn 


828


1437 




Terminal 
(nt) 


2441005 


2441890 


2442792 


2441602 


2443356 


2444033 


2445709 


2446993 


2447998 


CO 
CM 

CO 
O 

tn 
rr 

CM 


2450859 


2451794 


2455435 


2455452 


2455720 


2457337 


2459371 


2460336 


2461167 


2462599 




Initial 
(nt) 


2441589 


2441669 


2442355 


2443356 


2444015 


2444551 


m 

CO 

r- 

T 
T 

CM 


2445716 


2447021 


TT 
CO 
O 
LO 

CM 


2451785 


2454637 


2454725


2455733 


2457066 


2457759 


2457863 


2459371 


2460340 


2461163 




SEQ
NO.
(a.a.)

6029 


6030 


6031 


6032 


6033 


6034 


6035 


6036 


6037


6038 


6039


6040


6041 


6042 


6043 


6044 


6045


6046


6047 


6048 




SEQ
NO.
(DNA)


2529 


2530 


2531


2532 


2533


2534 


2535 


2536 


2537 


2538 


2539 


2540 


2541 


2542 


2543


2544 


2545 


2546 


2547 


2548 
Function


hypothetical protein 


hypothetical protein 


ribose kinase 


hypothetical membrane protein 





sodium-dependent transporter or
sodium bile acid symporter family


apospory-associated protein C 




thiamine biosynthesis protein x 


hypothetical protein 


glycine betaine transporter 








large integral C4-dicarboxylate 
membrane transport protein 


small integral C4-dicarboxylate 
membrane transport protein 


C4-dicarboxylate-binding 
periplasmic protein precursor 


extensin I 


GTP-binding protein


Matched 
length 
(aa) 


CO 

o 


m 


o 
o 

CO 


to 

CO 




CO 
CN 


in 

CO 
CN 




CO 
CO 


cn 


o 

CO 








CO 
TT 


CO 
^* 


r-. 

CM 
CNI 


CO 
T 


CO 

o 

CO 


Similarity 
(%) 


p 


58.0 


65.0 


64.6 




61.6 


51.2 




100.0 


999 


71.7 








71.9 


co 


© 
cn 

CO 


73.0 


83.6 


Identity 
(%) 


35.0 


29.3 


o 


39.9 




31.3 


28.5 




100.0 


42.6 


39.8 








34.6 


33.9 


28.2 


63.0 


58.7 


Homologous gene 


Aeropyrum pernix K1 APE 1580 


Aquifex aeolicus VF5 aq_768


Rhizobium etli rbsK 


Streptomyces coelicolor A3(2) 
SCM2.16C 




Homo sapiens 


Chlamydomonas reinhardtii 




Corynebacterium glutamicum
ATCC 13032 thiX 


Mycobacteriophage D29 66


Corynebacterium glutamicum 
ATCC 13032 betP 








Rhodobacter capsulatus dctM


Klebsiella pneumoniae dctQ 


Rhodobacter capsulatus B10
dctP 


Lycopersicon esculentum 
(tomato) 


Bacillus subtilis 168 lepA


db Match 


PIR:G72536 


pir:D70367


prf:2514301A 


gp:SCM2_16




sp:NTCI_HUMAN 


CO 
CN 

m 

CO 
U- 

< 

CL 

cn 




o 
a: 
O 
o 

X 
X 

h- 

m 


Q 

CL 

m . 

CO 
CO 

o 
> 

CL 

%n 


-j 
o 

rz 
O 
o 

Q.' 
LU 

m 

Q_ 
</> 








prf:2320266C 


gp:AF186091_1


< 
o 
o 
r 

5 

a 
o 

CL 
Ul 


PRF:1806416A


CO 

o 
< 
m 

<• 

CL 
LU 
_l 

CL 

%n 


n 


r-- 
O 

m 


o> 

m 


CO 

o 

CO 


1425 


CO 

o 

CO 


CN 

r-- 
cn 


CO 
CO 


CO 
CO 
CO 


o 
r- 
in 


CO 
CO 

in 


1890 


S 

CO 


1608 


CO 
CO 


1311 


o 

GO 


f-^ 


CO 
TT 

CN 


1845 


Terminal 
(nt) 


2461543 


2462602 


2464143 


2465768 


2465465 


2466038 


2467922 


2470678 


2472819 


2472893 


2475542 


2477492 


2479251


2479762 


2479898 


2481213 


2481734 


2484087 


2482548 


Initial 
(nt) 


2462049 


2463150 


2463241 


2464344 


2465767 


2467009 


2467077 


2470313 


2472250 


2473480 


2473653 


2476497 


2477644 


2479379 


CO 

o 

CN 
CO 
CN 


2481692 


o 

CO 

CN 
CO 

CN 


CO 
T 

CO 
CO 

CN 


2484392 


SEQ
NO.
(a.a.)


6049 


6050 


6051 


6052 


6053


6054


6055 


6056 


6057 


6058 


6059 


6060


6061 


6062 


6063 


6064 


6065 


6066 


6067 


SEQ
NO.
(DNA)


2549 


2550


2551 


2552 


2553


2554


2555


2556 


2557 


2558 


2559 


2560 


2561 
2562 


2563 


2564


2565 


2566 


2567
for


for 






















Function 


hypothetical protein 


30S ribosomal protein S20 


threonine efflux protein


ankyrin-like protein


hypothetical protein 


late competence operon required for
DNA binding and uptake


late competence operon required for
DNA binding and uptake




hypothetical protein


phosphoglycerate mutase 


hypothetical protein 


hypothetical protein 




gamma-glutamyl phosphate 
reductase or glutamate-5- 
semialdehyde dehydrogenase 


D-isomer specific 2-hydroxyacid 
dehydrogenase 




GTP-binding protein 


Matched
length
(a.a)


to 

CO 


in 

CO 


o 

CM 


cn 

CN 


co 

CO 


r*. 

CM 

in 


m 

" CO 




coro 

CNTN 


to 

CO 
CM 




cn 




CM 
CO 

TT 


o 

CO 




co 










































































Similarity
(%) 


69.7 


72.9 


67.1 


80.6 


74.1 


49.7 


63.6 




rro 
icxb 

VXD 


66.4 


86.3 


85.3 




99.8 


o 

o 
o 




78.2 


to 




































Identity 
(%) 


41.6 


CM 
CO 


30.0 


61.2 


46.0 


21.4 


30.8 




coro 


46.8 


55.6 


66.0 




99.1 


99.3 




58.9 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv2405


Escherichia coli K12 rpsT


Escherichia coli K12 rhtC 


Streptomyces coelicolor A3(2) 
SC6D7.25. 


Mycobacterium tuberculosis 
H37Rv Rv2413c 


Bacillus subtilis 168 comEC 


Bacillus subtilis 168 comEA 




Streptomyces coelicolor A3(2) 
SCC123.07c. 


Mycobacterium tuberculosis 
H37Rv Rv2419c 


Mycobacterium tuberculosis 
H37Rv Rv2420c 


Streptomyces coelicolor A3(2) 
SCC123.17c. 




Corynebacterium glutamicum 
ATCC 17965 proA 


Corynebacterium glutamicum 
ATCC 17965 unkdh 




Streptomyces coelicolor A3(2) 
obg 


db Match 


pir:H70683 


sp:RS20_ECOLI 


— i 
o 
o 

X 
(X 

CL 

to 


tr> 

CM 

I 

r*- 
Q 
CO 

O 
CO 

CL 

cn 


pir:H70684 


sp:CME3_BACSU 


sp:CME1_BACSU 




com 

CMTM 

UO 
OO 
COO 
b'cL 

OCT) 


pir:F70685 


pir:G70685


gp:SCC123_17




sp:PROA_CORGL 


_j 

(D 
rr 
O 
o 

<• 

rr 

CL 

ioL 

to 




gp:D87915_1




cn 
o 

CO 


CO 
CM 


cn 

CO 

to 


Lf) 
O 


in 
r- 
cn 


1539 


CM 
CO 

in 


CN 
CN 
CO 


CNCN 
CNTN 
COCO 


CO 

o 

r- 




CO 

h- 
co 


1023 


1296 


CN 

cn 




1503 


Terminal 
(nt) 


2485269 


2485733 


2485801 


2486477 


2486910 


2487912 


2489573 


2491732 


go 
oo 

CNCM 
OO 

cncn 

CNCN 


2491151 


2491873 


2492501 


2493215 


cn 

CO 
CO 
T 

cn 

CM 


2495696 


2497513 


2498009 


Initial 
(nt) 


2484661 


2485473 


2486469 


2486881 


2487884 


2489450 


2490154 


2490911 


COCO 
OCM 


2491858 


2492343 


2493178 


2494237 


2495634 


2496607 


2496803 


2499511 


SEQ 
NO. 
(a.a.) 


6068 


6069 


6070 


6071 


6072 


6073 


6074 


6075 


too 

co 
tcto 


6077 


6078 


6079 


6080 


6081 


6082 


6083 


6084 


SEQ
NO.
(DNA)


2568 


2569


2570 


2571


2572 


2573 


2574 


2575 


coco 
inn 

CVTN 


2577 


2578 


2579 


2580 


2581 


2582


2583


2584



Function 
xanthine permease 


2,5-diketo-D-gluconic acid reductase 




50S ribosomal protein L27


50S ribosomal protein L21
ribonuclease E 


hypothetical protein 


transposase (insertion sequence 
IS31831) 


hypothetical protein 


hypothetical protein 
nucleoside diphosphate kinase 


hypothetical protein 


hypothetical protein 


hypothetical protein 




Matched 
length 
(aa) 

422 


CD 
fN 




T — 

CO 


t- CO 
O CO 
CO 


1 in 

l CD 

* ^» 


CD 
CO 




CO 

CO 


CN 
CO 


CN 


CO 


Similarity 
(%) 

77.3


! cn 
co 




92.6 


CN CO 
CN CO 
CO LD 


1 CD 

1 CN 
• CO 


100.0 


76.9 


CO CO 

r*-' oS 
CO 00 


67.4 


64.3 


68.6 


Identity 
(%) 


61.2




80.3 


56.4 
30.1 


61.0 


99.1 


51.3 


37.8 
70.9 


34.8


36.6 


33.9 


Homologous gene 


o 

! f 

1 5 
tn cn 

- c s 

ro o 
D O ro 




Streptomyces griseus IF013189 
rpmA 


Streptomyces griseus IF013189 
obg 

Escherichia coli K12 rne


c T co' 

<J< 

v. 1— 

c 0 
"c 0 

CO 

o*3 
c 0 

C CJ 
it if* 
0 0> 

coo 
5 >>co 

(O CO 

2£u 

(/CO CO 


Corynebacterium glutamicum
ATCC 31831 


Streptomyces coelicolor A3(2) 
SCF76.08c 


Streptomyces coelicolor A3(2) 
SCF76.09 

Mycobacterium smegmatis ndk 


Deinococcus radiodurans R1 
DR1844 


Mycobacterium tuberculosis 
H37Rv Rv1883c


Mycobacterium tuberculosis 
H37Rv Rv2446c 


db Match 


sp:PBUX_BACSU
pir:I40838




cr 
O 
cr 

co 

,J 

CN 
—1 

cr 

CL 
i/i 


_j 

< o 

to O 

5 4 

co 2: 
cn cr 
t cL 


CO 

LU 
O 
CO 

CL 

cn 


pir:S43613 


CO ( 

CD 

LL. 
O 
CO 

cn 


co' 3 

tr g 

O u. 
co < 

CL CL 

cn cn 


gp:AE002024_10


pir:H70515 


pir:E70863


ORF 
(bp) 


1887 


CN 

to 


tD 

CD CD 
CO CN 


303 

2268
549


CO f*- <Ji 
^ 0 
CO h- CO 


1308 


CO 
CO 


0 co 
in 0 


O CN 
CD *T 
CO CO 


LO 
CD 


CO 
CN 




Terminal 
(nt) 


2501669 
2501735 


2503355 


2504265 
2503984 


2504300 

2504831 
2507663 


2507710 
2508840 


2509530 


2509523 


2511423 


2511876 
2511949 


2512409 
2513144 


2513154 


2513692 




Initial 
(nt) 


2499783 
2502577 


2502735 


2503670 
2504247 


2504602 
2507115


2507138 
2508094 


2508922 


2510830 


to 
0 

in 

CN 


2511427 


2512768 
2512803 


2513618 


2514114 


i 


Sg s 


6085 
6086


6087 


6088
6089 


6090 
6091 


6093 
6094


6095


6096 


6097 


6098


6100 
6101 


6102 


6103 


SEQ 
NO. 
(DNA) 


2585
2586
2587


2588
2589 


2590 
2591 

2592


2593 
2594 


2595 
2596 


i 

2597 


2598 


2600 
2601 


2602 


2603 



Function


folyl-polyglutamate synthetase








valyl-tRNA synthetase 


oligopeptide ABC transport system 
substrate-binding protein 


heat shock protein dnaK 


lysine decarboxylase 


malate dehydrogenase 


transcriptional regulator 


hypothetical protein 


vanillate demethylase (oxygenase) 


pentachlorophenol 4-
monooxygenase reductase


transport protein 


malonate transporter 


class-III heat-shock protein or ATP-
dependent protease 


hypothetical protein 


succinyl CoA:3-oxoadipate CoA
transferase beta subunit


succinyl CoA:3-oxoadipate CoA
transferase alpha subunit 




Matched 
length 
(a.a) 


tn 








m 
o> 


CM 

tn 


oo 
o 
in 


o 

r- 


cn 

^- 

co 


r r— 
c o 

C CM 


CO 
O 
CM 


m 

CO 


GO 
CO 
CO 


TT 


CD 
GO 
CM 


o 

CO 


CO 
CD 
CO 


o 

CM 


to 

CM 












































Similarity 
(%) 


to 










tn 


cn 


CM 


to 


t tn 




to 


CM 


CO 


T 


CO 


o 


r- 


to 


20 


oS 








CM 
r- 


CO 

to 


to 




to 
r— 


c to' 
t to 


in 


CO 
CO 


oS 
tn 


CO 
r- 


CO 

m 


tri 

CO 


CO 


tri 

GO 


CO 




Identity 
(%) 


T 








to 


CM 


CM 


cn 


<*r 


c to 


o 


m 


CO 


CO 


o 


CO 


CD 


CO 


CM 




tn 

m 








to" 


CM 


CO 
CM 


CM 


to 
to 


C CM 


to 

CM 


Ol 
CO 


cm' 

CO 


o 


CO 
CM 


cn 
to 


m' 


CO 
CO 


O 
CO 


to ro 

Table 1 (continued) 


Homologous gene 


Streptomyces coelicolor A3(2) 
folC 








Bacillus subtilis 168 valS


Bacillus subtilis 168 oppA


Bacillus subtilis 168 dnaK


Eikenella corrodens ATCC 
23824 


Thermus aquaticus ATCC 33923 
mdh 


cmV 
coo* 

« 

oo 
oo 
oo 

"oo 
oo 
oo 
trttn 
oo 

OO CO 

»>.«o 

EE o 
oo 

nn < 

oo 

-fct: O 

W/) C/> 


Vibrio cholerae aphA 


Acinetobacter sp. vanA 


Sphingomonas flava ATCC 
39723 pcpD 


Acinetobacter sp. vanK 


Klebsiella pneumoniae mdcF 


Bacillus subtilis clpX


Streptomyces coelicolor A3(2)
SCF55.28c 


Streptomyces sp. 2065 pcaJ 


Streptomyces sp. 2065 pcaI




db Match 


2410252B 








CO 
CD 

>' 

>- 

co 


A38447 


DNAK_BACSU 


i 

Ol 

CO 

3 
O 
LU 


—i 
LL. 
LU 

X 

X" 
Q 
5 


C CO 
C CO 

c o 

« < 

C O 
t U) 


*~i 

CM 

TT 

tn 
to 
o 

< 


2513416F 


cn 

CM 
CM 

CO 

u. 


2513416G 


iJ 

CO 

o 
tn 

CO 

3 

a 

bC 


2303274A 


CO 
CN ( 

to 
to 

IX 

u 
to 


CM ( 

CD 
CO 
CO 

cn 
o 

LU 
< 


AF109386_1






t: 
o. 




i 




sp: 


pir: 


sp: 


§i 


:ds 


i CL 

i cn 


CL 

cn 


I prf 


b_ 
cn 


prf 


CL 

cn 


nrf 

prr 


CL 

cn 


cn 


CL, 

cn 




it 


1374 


CM 

to 


^* 


CO 

to 

CO 


2700 


1575 


1452 


tn 

CO 

m 


GO 

cn 


r 
r 

r r- 


CD 

in 


1128 


tn 
cn 


1425 


o 

CO 

cn 


1278 


1086 


CO 
CO 
CD 


o 
to 




Terminal 
(nt) 


2514114 


2516273 


2516956 


2517751 


2515637 


2518398 


2521660 


2521667 


2522265 


r h- 

C CO 
C CO 

- T 

C CM 

t to 

C CM 


2524340 


2526226 


2527207 


2528559 


2528551 


2529484 


2531976 


2531969 

I 


2532604 




Initial 
(nt) 


2515487 


2515662 


2516243 


2517089 


2518336 


2519972 


2520209 


2522251 


2523248 


C CD 

t in 

C CO 
C CM 

t m 

C CM 


2524915 


2525099 


2526233 


2527135 


2529480 


2530761 


2530891 


2532601 


2533353 




SEQ 
NO. 
(a.a.) 


i S 


tn 
I <=> 


* ° 


r— 
o 


CO 

o 


cn 
o 


o 




CM 


C CO 


t 


tn 


• CO 


r- 


CO 


cn 


I o 

CM 


CM 


CM 
CM 




i to 


I <° 


Ico 


to 


CO 


CO 


to 


CO 


CO 


C CO 


CD 


CO 


CD 


CD 


CD 


CO 


CO 


CO 


CO 




SEQ 
NO. 
(DNA) 


o 
to 
cm 


2605 


2606 


2607 


2608 


2609 


o 

CO 
CM 


2611 


2612 


2613 


2614 


2615 


2616 


2617 


2618


2619 


2620 


2621 


2622 
Table 1 (continued)



Function 


protocatechuate catabolic protein 


beta-ketothiolase




3-oxoadipate enol-lactone hydrolase 
and 4-carboxymuconolactone 
decarboxylase 


transcriptional regulator 


3-oxoadipate enol-lactone hydrolase 
and 4-carboxymuconolactone 
decarboxylase 




3-carboxy-cis,cis-muconate
cycloisomerase 


protocatechuate dioxygenase alpha 
subunit 


protocatechuate dioxygenase beta 
subunit 


hypothetical protein 


muconolactone isomerase 




muconate cycloisomerase 




catechol 1,2-dioxygenase 




toluate 1,2 dioxygenase subunit 


Matched 
length 
(aa) 


LO 
CN 


CO 

o 




CO 

in 

CN 


to 

CN 
CO 


LO 




co 


CN 


r- 
CN 


CO 

r-- 

CM 


CM 
O) 




CM 

CO 




in 

CO 
CM 




CO 


Similarity 
(%) 


82.5 


71.9 




76.6 


43.0 


89.6 




63.4 


70.6 


91.2 


48.7 


81.5 




CO 




88.4 




85.6 


Identity 
(%)


58.2 


44.8 




CO 

o 
in 


23.6 


78.3 




39.8 


49.5 


74.7 


26.4 


54.4 




60.8 




72.3 




62.2 


Homologous gene 


Rhodococcus opacus 1CP pcaR 


Ralstonia eutropha bktB 




Rhodococcus opacus pcaL 


Streptomyces coelicolor A3(2)
SCM1.10 


Rhodococcus opacus pcaL 




Rhodococcus opacus pcaB 


Rhodococcus opacus pcaG 


Rhodococcus opacus pcaH 


Mycobacterium tuberculosis 
H37Rv Rv0336 


Mycobacterium tuberculosis 
catC 




Rhodococcus opacus 1CP catB 




Rhodococcus rhodochrous catA 




Pseudomonas putida plasmid
pDK1 xylX 


db Match 


UL 
rr 
CN 
CO 
CO 

o 

CN 

iz 

Q_ 


prf:2411305D 




UJ 
T 
CN 
CO 
CO 

o 

CN 
t= 

cx 


o 

B 

o 

cn 
Cl 
cn 


prf:2408324E 




prf:2408324D 


o 

CN 
CO 
CO 

o 

CN 
CL 


m 

CM 
CO 
CO 

o 

CN 

t: 

D_ 


pir:G70506 


prf:2515333B 




0. 
O 

O 

X 

5 

Si 

o 

CL 




prf:2503218A




CO 

CO 

CO 

LU 
< 
CL 
CO 


si 


CN 
CO 

r-. 


1224 


CN 

5 


CO 
LO 


2061 


CO 
CO 
CO 


CO 

r- 
to 


1116 


CN 

to 


o 
Oi 
CD 


1164 


o> 

CM 


r*- 


1119 


CD 
O 

to 


m 
m 

CO 




1470 


Terminal 
(nt) 


2534182 


2535424 


2534257 


2536182 


2538256 


2538248 


2540230 


2538616 


2539709 


2540335 


2541187 


2542512 


2543813 


2542818 


2544867 


2544022 


2544928 


2546784 


Initial 
(nt) 


2533391 


2534201 


2535168 


2535430 


2536196 


2538613 


2539553 


2539731 


2540320 


2541024 


2542350 


2542802 


2543043


2543936 


2544262 


2544876


2545068


2545315 




6123 


6124 


6125 


6126 


6127 


6128 


6129


6130 


6131


6132


6133 


6134


6135 


6136 


6137 


6138 


6139 


6140 


SEQ 
NO. 
(DNA) 

2623 


2624 


2625


2626 


2627 


2628 


2629 


2630 


2631 


2632 
2633 


2634 


2635


2636 


2637


2638 


2639 


2640 


Function 


toluate 1,2 dioxygenase subunit 


toluate 1,2 dioxygenase subunit 


1,2-dihydroxycyclohexa-3,5-diene
carboxylate dehydrogenase 


regulator of LuxR family with ATP- 
binding site 


transmembrane transport protein or 
4-hydroxybenzoate transporter 


benzoate membrane transport 
protein 


ATP-dependent Clp protease
proteolytic subunit 2 


ATP-dependent Clp protease
proteolytic subunit 1


hypothetical protein 


trigger factor (prolyl isomerase) 
(chaperone protein) 


hypothetical protein 


penicillin-binding protein 


hypothetical protein 




transposase 




hypothetical protein 


transposase 




Matched 
length 
(a.a)


CO 


CN 
CO 


CN 


cn 
r- 
cn 


tn 

CO 


CO 

CO 
CO 


cn 


C CO 

c o> 


CM 




o 

CO 


CO 
CO 
CO 


in 




CM 




in 

CO 


tn 




Similarity 


83.2 


81.0 


61.4 


48.6 


64.4 


66.2 


88.3 


85.9 


71.4 


66.4 


CO 
CO 


50.9 


58.3 




73.2 




CO 
CM 
CO 


78.7 




Identity 

(%)


60.3 


51.5 


30.7 


23.3 


31.3 


29.9 


69.5 


62.1 


42.9 


32.1 


32.5 


25.3 


27.8 




54.2 




57.1 


50.7 


Table 1 (continued)


Homologous gene 


Pseudomonas putida plasmid 
pDK1 xylY 


Pseudomonas putida plasmid
pDK1 xylZ


Pseudomonas putida plasmid 
pDK1 xylL 


Rhodococcus erythropolis thcG


Acinetobacter calcoaceticus 
pcaK 


Acinetobacter calcoaceticus 
benE 


Streptomyces coelicolor M145 
clpP2 


Streptomyces coelicolor M145

clpP1 


Sulfolobus islandicus ORF154 


Bacillus subtilis 168 tig 


Streptomyces coelicolor A3(2) 
SCD25.17 


Nocardia lactamdurans LC411 
pbp 


Mus musculus Moa1 




Corynebacterium striatum ORF1 




Corynebacterium striatum ORF1 


Corynebacterium striatum ORF1 


db Match 


CO 
CO 

co 

< 

cL 
cn 


gp:AF134348_3 


co 

CO 

U_ 
< 

cL 
cn 


o' 

r- 

m 
cn 

LU 

oz 

O. 

cn 


< 
o 
o 

5 

< 

o 
o. 

CL 

to 


< 
o 
u 
< 

l 

LU 
Z 
LU 
CD 

icL 
v> 


gp:AF071885_2 


gp:AF071885_1 


gp:SIS243537_4 


sp:TIG_BACSU


gp:SCD25_17


5 

o 
o 

1 

^* 

CL 
CD 
CL 

CL 
to 


prf:2301342A 




prf:2513302C 




prf:2513302C 


prf:2513302C 




ORF
(bp)


CN 

cn 


1536 


CO 
CM 
CO 


2685 


1380 


1242 


CM 
CO 


t CO 

l o 

t CD 


o 
m 


1347 


in 
cn 


in 
cn 


CO 

tn 


cn 

CM 


CO 
CO 


o 
m 
^— 


CO 
CN 


"<T 
CO 
CM 


45 


Terminal 
(nt) 


2547318 


2548868 


2549695 


2552455 


2553942 


2555267 


2555317 


2555978 


2556748 


2556760 


2559103 


2560131 


2560586 


2561363 


2561483 


2562242


2561990 


2562078 


50 


Initial 
(nt) 


2546827 


2547333 


2548868 


2549771 


2552563 


2554026 


2555940 


2556580 


2556599 


2558106 


2558609 


2559157 


2560131 


2561115 


2561920 


2562093 


2562115 


2562341 




SEQ 
NO. 
(a.a.)


6141 


6142 


6143 


CO 


6145 


6146 


6147 


i CO 

• ^ 

i CO 


cn 

CO 


6150 


6151 


6152 


CO 
CO 


6154 


6155 


6156 


6157 


6158 


55 


SEQ 
NO. 
(DNA) 


t- 

CO 
. CN 


2642 


2643 


*T 
*«T 
CO 
CN 


2645 


2646 


2647 


2648 


2649 


2650 


2651 


2652 


2653 


2654 


2655 


2656 


2657 


2658
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Function 






galactose-6-phosphate isomerase 


hypothetical protein 


hypothetical protein 


aminopeptidase N 


hypothetical protein 








CO ) 

(/> 1 

CO ] 

3 ! 
*co i 

c/l > 
0) ) 

-o I 

CD ) 

c : 

0) ) 

o > 
"*>*^ 
jc : 

CXI 






phytoene dehydrogenase 


phytoene synthase 


multidrug resistance transporter 




ABC transporter ATP-binding protein 


dipeptide transport system 
permease protein 


nickel transport system permease 
protein 




15 




Matched 
length 
(aa) 






o 


CO 
T 

CM 


cn 


o 
cn 

CO 


CO 

m 

CO 








O ) 






CO 
CO 


a 
o 

CM 


CM 
Cft 
CO 




CO 

CO 

m 


to 

CO 
CM 


CO 
CO 




















in 
















CO 


CO 


r-» 




CD 


CO 


o 




20 




In 








CO* 

in 


o 

CO 


o 

r-> 


CO* 

m 








OO ) 






CO 
CO 


CO 

to 






r*» 


CO 


CM 
CO 








>■» 






o 


CM 


CO 


m 










un > 






CM 


TT 


CO 




co 


CO 


CM 














o 


CO 
CM 


CO 

m 


r-" 


to 

CM 








to > 






CO 


CO 


m 

CM 






CO 
CO 


CO 
CO 




25 
30 
35 
40 


Table 1 (continued) 


Homologous gene 






Staphylococcus aureus NCTC 
8325-4 lacB 


Bacillus acidopullulyticus ORF2 


Mycobacterium tuberculosis 
H37Rv Rv2466c 


Streptomyces lividans pepN 


Borrelia burgdorferi BB0852 








Brevibacterium linens ATCC 
9175 crtI






Myxococcus xanthus DK1050 
carA2 


Streptomyces griseus JA3933 
crtB 


Listeria monocytogenes IltB 




Synechococcus elongatus 


Bacillus firmus OF4 dppC


Escherichia coli K12 nikB 






db Match 






co 

m 1 

o 

5 

cL 
m 


o 
< 

< 

CD 
>' 

< 
> 

in 


pir:A70866 


—i 
tr 
\- 
co 

1 

z 

CL 

< 

id. 

C/l 


pir:B70206 








CO 1 

1 1 

CO 3 

cn o 

O) 0 

co •> 

LL - 

< c 

CT1D 






> 

C£ 
O 

CL 

(ft 


DC 

o 
rr 
i- 
co 

co' 
l~ 
tr 

o 

CL 
«/) 


CO 

t 

r- 

CM 
CO 

cn 

3 

_i 

CL 
O) 




gp:SYOATPBP_2 


LL 
O 
< 
CD 

» 

O 
Q_ 
Q. 
O 

CL 

t/i 


to 
cn 

CO 

r*- 

to 

*CL 










o 
cn 

CO 


in 

CO 
CO 




CO 
CO 

to 


o 
to 


2601 


1083 


1152 


CO 

to 

CO 


CO 

tn 


- 

CM vJ 
CO 0 




CO 
CO 


1206 


CO 
CO 


1119 


1233 


1641 


CM 
CO 
CO 


cn 

CO 

en 


1707 


45 




Terminal 
(nt) 


2562387 


2563847 


2563932 


2564550 


2565623 


2568945 


2570293 


2570309 


2572175 


2572348 


in ■> 

CO 1 
CM M 

in ■> 

CM M 


2572807 


2573393 


2572659 


2573843 


2574780 


2575981 


2577232 


2578879 


2579769 


2580711 


50 




Initial 
(nt) 


2562776 


2562963 


2564402 


2565245 


2566231 


2566345 


2569211


2571460 


2571510 


2572193 


r-. - 
- 

CO 3 
CN n» 

r«- - 
in o 

CM M 


2572977


2573770 


2573864 


2574718 


2575898 


2577213 


2578872


2579760 


2580707 


2582417 








o> 
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o 
to 


to 


(M 
CO 


CO 
CO 


CO 


in 

CO 


to 

CO 


r- 
co 


CO 
CO 


cn n 

CO 0 


o 


h- 


CM 

r> 


I CO 


r- 


*n 
r- 


CO 




CO 


cn 
r- 








i 50 


to 


to 


CO 


to 


to 


CO 


CO 


CO 


to 


CO D 


CO 


CO 


CO 


I ^ 


CO 


co 


to 


to 


CO 


co 


55 
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NO. 
(DNA) 


2659 


2660 


2661 


2662 


2663 


2664 


2665 


2666 


2667 


2668 


cn n 
to D 

CD D 
CM >J 


2670 


2671 


2672 


2673 


2674 


2675 




2677 


2678 


2679 
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Function 




acetylornithine aminotransferase 


hypothetical protein 


hypothetical membrane protein 


acetoacetyl CoA reductase 


transcriptional regulator, TetR family 


polypeptides predicted to be useful 
antigens for vaccines and 
diagnostics 


ABC transporter ATP-binding protein 


globin 


chromate transport protein 


hypothetical protein 


hypothetical protein 




hypothetical protein 


ABC transporter ATP-binding protein 


hypothetical protein 


hypothetical membrane protein 


alkaline phosphatase


Matched 
length 
(aa) 






CM 
CO 


CO 

CM 


LO 
CO 
CM 


o 

CM 


o> 


CO 
CO 
CM 


CO 
CM 


CO 

cn 

CO 


CO 

cn 


CN 




m 
in 


CO 
CO 

m 


CN 


o 


CO 
CO 

in 


Similarity
(%) 




63.5 


47.9 


79.4 


O 

o 

CO 


55.0 


o 
^* 


iri 

CO 


77.0 


60.4 


68.9 


61.4 




60.0 


CO 

cn 


62.2 


56.7 


52.6 


to 






































Identity 
(%) 




31.4 


25.1 


49.1 


28.1 


26.7 


38.0 


31.1 


53.2 


27.3 


37.8 


36.2 




36.4 


52.8 


31.4 


28.0 


28.0 


Homologous gene 




Corynebacterium glutamicum 
ATCC 13032 argD 


Mycobacterium tuberculosis 
H37Rv Rv1128c


Mycobacterium tuberculosis 
H37Rv Rv0364 


Chromatium vinosum D phbB 


Streptomyces coelicolor actII


Neisseria meningitidis 


Pseudomonas putida GM73
ttg2A 


Mycobacterium leprae 
MLCB1610.14c 


Pseudomonas aeruginosa 
Plasmid pUM505 chrA 


Mycobacterium tuberculosis 
H37Rv Rv2474c 


Streptomyces coelicolor A3(2) 
SC6D10.19c 




Aeropyrum pernix K1 APE1182


Escherichia coli K12 yjjK


Mycobacterium tuberculosis 
H37Rv Rv2478c 


Mycobacterium leprae o659 


Bacillus subtilis phoB
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ptr:A70539 


id 
i— 
o 
>- 
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CM 
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>- 
CL 
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I 

o 

< 
Q. 
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o 
o 

CO 

o 

* — 

LL 

< 
Cl 
cn 


o' 

CO 
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_j 
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o 

LU 
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_j 
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>- 

J 

in 
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cL 
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1941 
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CO 
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co 
ro 
r» 


*r 


CM 
O) 

r» 


ro 
O) 

ro 


1 
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CN 
CO 


LO 
CO 

t 


CM 
CO 


CM 
CD 
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in 

CO 


2103 


1419 


Terminal 
(nt) 


2584504 


2585926 


2587763 


2588722 


2588725 


2590302 


2591137 


2591574 


2592794 


2593965 
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2594597 


2595188 


2595822 


2596048 


2597869 


2598662 


2602879
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(nt) 


2582564 


2584613 


2586180 


2587976 


CN 
CO 

cc 

if 

CN 


2589565 


2590697 


2592365 


2592402 


2592838 


2594594 


2595061 


2595808 


2595983 


2597715 


2598483 


2600764 


2601461 
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CC 
IE 
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6189 
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6191 
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OI 
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2691 


CM 

cn 
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CN 


C 
CT 
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cn 
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o 
o 



Function 


ferric enterochelin esterase 


lipoprotein 








transposase(IS1207) 






transcriptional regulator 


glutaminase 


sporulation-specific degradation 
regulator protein 




uronate isomerase 




hypothetical protein 


pyrazinamidase/nicotinamidase 


hypothetical protein 


bacterioferritin comigratory protein 


bacterial regulatory protein, tetR 
family 


Matched 
length 
(aa) 


LO 

TJ" 


CO 
CO 
CO 








CO 
CO 
TT 






CO 


CO 
LO 
CO 


e'- 
en 




m 

CO 
CO 




CM 


m 

CO 


in 
r- 


TT 




Similarity
(%) 


CO 
O 

tn 


71.9 








99.8 






63.4 


69.3 


72.2 




60.9 




45.0 


74.6 


80.0 


73.8 


61.4 


CO 








































Identity
(%)


26.0


48.5 








99.5 






32.8 


35.2 


42.3 




29.0 




32.0 


48.1 


42.7 


46.8 


32.5 


Homologous gene 


Salmonella enterica iroD 


Mycobacterium tuberculosis 
H37Rv Rv2518c lppS








Corynebacterium glutamicum
ATCC 21086 






Salmonella typhimurium KP1001 
cytR 


■ 

LU 

3 

o 
< 
oi 

co 

tn » 
=i LOU 

^ zz 

0 » 
c "'" 
to 1 

£ 55 

to <<C 

01 OD 


Bacillus subtilis 168 degA 




Escherichia coli K12 uxaC 




Zea diploperennis perennial 
teosinte 


Mycobacterium avium pncA 


Mycobacterium tuberculosis 
H37Rv Rv2520c 


Escherichia coli K12 bcp 


Streptomyces coelicolor A3(2) 
SCI11.01C 


db Match 


prf:2409378A 


pir:C70870 








gp:SCU53587_1 






gp:AF085239_1 


S: 
cc : 

CO ) 

-J J 

O ) 

CL i. 
to ) 


pir:A36940




_> 

o 

CJ 
LU 

I 

O 
3 

CL 

to 




prf:1814452C


< 

TT 

■*r 

CM 
CO 
CM 

h— 

o. 


pir:E70870 


i 

O 

u 

LU 

a' 

o 

CO 

id. 
to 


gp:SCI11_1


ORF 
(bp) 


1186 


1209 


in 

to 


o 
m 


to 

ICN 


1308 


o 

CM 


CO 
CO 
CO 


CO 

in 


cn > 

CM 4 
CO 5 


r-~ 

r-- 


cn 

LO 

m 


in 
m 


o 

LO 


1197 


CO 

in 
tn 


p 

CM 


in 

CO 
TT 


CO 
CO 
CD 


Terminal 
(nt) 


2619541 


2620973 


2623605 


2623621 


2624048 


2624051 


2625806 


2625809 


2628376 


CO > 

cn -> 
r 

CO D 
CM *J 
CO ? 
CM x! 


2628852 


2628324 


2630479 


2631136 


2632466 


2633100 


2633146 


2634064 


2634751 


Initial 
(nt) 


2620728 


2622181 


2622961 


2623770 


2623803 


2625358 


2625600 


2626447 


2627924 


CM st 

CO 3 
CM M 
CO 3 
CM s| 


1 

2628376 


2628878 


2628926 


2630636 


2631270 


2632543 


2633418 


2633600 


2634116 


SEQ
NO.
(a.a.)


6216 


6217 


6218


6219 


6220


6221 


6222 


6223 


6224 


in i 

CM x] 
CM si 
CO 3 


6226 


6227 


6228 


6229 


6230 


6231 


6232 


6233 


6234 


SEQ
NO.
(DNA)


* to 


2717 


2718 
2719 
2720 

2721 


CM ' CO 

cm ; CM 
r- r- 

CM CM 


2724 


LO ") 
CM M 

r*- - 

CM M 


2726 


2727 


2728 


2729 


2730 


2731


2732 


2733 


2734 
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tein 


10


Function


phosphopantetheine protein
transferase


lincomycin resistance protein


hypothetical membrane protein




fatty-acid synthase


hypothetical protein


peptidase


hypothetical membrane protein


hypothetical membrane protein


hypothetical protein


ribonuclease PH








hypothetical membrane protein


transposase (IS1628) 




arylsulfatase 


15 


Matched 
length 
(aa) 


to 


CO 

r- 


CO 




3029 


o 

T 
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CO 
CN 


CN 


CO 
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CN 
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CO 
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CO 

cn 


O 
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ai 

CN 


CN 

o> 




to 


Table 1 (continued) 


Homologous gene 


Corynebacterium 
ammoniagenes ATCC 6871 ppt1 


Corynebacterium glutamicum 
ImrB 


Synechocystis sp. PCC6803 




Corynebacterium 
ammoniagenes fas 


Streptomyces coelicolor A3(2)
SC4A7.14 


Mycobacterium tuberculosis 
H37Rv Rv0950c 


Mycobacterium tuberculosis 
H37Rv Rv1343c


Mycobacterium leprae 
B1549_F2_59 


Mycobacterium tuberculosis 
H37Rv Rv1341 


Pseudomonas aeruginosa 
ATCC 15692 rph








Mycobacterium tuberculosis 
H37Rv SC8A6.09c 


Corynebacterium glutamicum 
22243 R-plasmid pAG1 tnpB 




Mycobacterium leprae ats 


db Match 


CO 

o 

CO 
CD 


AF237667_1


S76537 




S2047 


< 

O 
CO 


D70716 


\— 
o 
> 

r— ' 
r*- 
o 
> 


Y076_MYCLE 


Y03Q_MYCTU 


LU 
< 
LU 
CO 
CL 

1 

X 

Ol 

tr 








z> 

1- 
O 
>- 

*' 

CN 
O 
>- 


.AF121000_8 




Y03O_MYCLE 






CL 

o> 


CL 
O) 


CL 




CL 


CL 

cn 


CL 


CL 

tn 


CL 
CO 


CL 
tn 


CL 

CO 








CL 
CO 


CL 

cn 




CL 

CO 




ORF 
(bp) 


LO 

o 


1425 


CN 
CO 




8979 


1182 


CO 
CO 


CN 
CO 


*T 

CO 
CO 


CO 
CD 


CO 

ro 
f- 


CO 
CN 


CO 
CO 
CO 


CN 
CO 
LO 


1362 


■<r 

CO 
CO 


o 

CO 
CO 


LO 

fe 


45 


Terminal 
(nt) 


2634747 


2635165 


2637168 


2637240 


2638649 


2648235 


2650164 


2650902 


2651339 


2651420 


2652067 


2653009 


2653326 


2654079 


2654875 


2656985 


2656974 


2657736 


50 


Initial 
(nt) 


2635151 


2636589 


2636845 


2637653 


2647627 


2649416 


2649550 


2650441 


2650986 


2652037 


2652801 


2653254 


2654018 


2654660 


2656236 


2656452 


2657633 


2658500 




SEQ 
NO. 
(a.a.)


6235 


6236 


6237 


6238 


6239 


6240 


I „_ 

CN 
CO 


6242 


5 

CN 
CO 


6244 


6245 


6246 


6247 


6248 


6249 


6250 


6251 


6252 


55 


SEQ 
NO. 
(DNA) 


2735 


2736 


2737 


2738 


2739 


2740 


2741 


2742 


2743 


2744 


2745 


2746 


2747


CO 

5 


2749 


2750 


2751 


2752 
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10 


Function 


D-glutamate racemase 




bacterial regulatory protein, marR 
family 


hypothetical membrane protein 




endo-type 6-aminohexanoate 
oligomer hydrolase 


hypothetical protein 


hypothetical protein 






hypothetical protein 




ATP-dependent helicase 


hypothetical membrane protein 


hypothetical protein 


phosphoserine phosphatase




cytochrome c oxidase chain I 




15 


Matched 
length 


■<r 

co 

CN 




r— 


to 

CM 
CM 




CM 
CO 


o 
o 

CM 


in 
o 






00 
CM 

"cr 




r- 
^* 

CO 


CO 
r— 

CO 


CM 
CM 
CM 


o 

CO 




in 
in 




20 


Similarity 
(%) 


99.3 




70.8 


69.3 




58.3 


in 

CO 
LO 


77.1 






80.8 




CO 

CO 

tn 


60.1 


52.0 


61.0 




74.4 






Identity 
(%) 


99.3 




44.2 


38.2 




30.2 


35.0 


57.1 






61.2 




25.2 


29.7 


39.0 


38.7 




CO 

CO 




25 

CD 
3 
C 

O 

30 

<D 

35 


Homologous gene 


Corynebacterium glutamicum 
ATCC 13869 murI




Streptomyces coelicolor A3(2) 
SCE22.22 


Mycobacterium tuberculosis 
H37Rv Rv1337 




Flavobacterium sp. nylC 


Mycobacterium tuberculosis 
H37Rv Rv1332 


Mycobacterium tuberculosis 
H37Rv Rv1331 






Mycobacterium tuberculosis 
H37Rv Rv1330c 




Escherichia coli dinG 


Mycobacterium tuberculosis 
H37Rv Rv2560 


Streptomyces coelicolor A3(2) 
SC1B5.06c


Escherichia coli K12 serB




Mycobacterium tuberculosis 
H37Rv Rv3043c




40 


db Match 


prf:2516259A
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2663437 


2664060 
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rotein 


rotein 






major secreted protein PS1 protein 
precursor 










1 


symport 








ding protein 








rogenase 
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Function 


hypothetical membrane protein


hypothetical membrane protein


hypothetical protein 


transposase (IS1676)




I 




transposase(IS1676) 




proton/sodium-glutamate symport
protein




ABC transporter 




ABC transporter ATP-binding protein


hypothetical protein 


hypothetical protein 




oxidoreductase or dehydrogenase
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Matched 
length 
(aa) 


GO 


CN 
CN 
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CN 


CD 
Oft 


LO 

m 

CO 
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o 
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CO 
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Similaril 
(%) 


64.3 


61.5 


79.1 


48.6 


49.6 








46.6 




66.2 
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oi 
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79.8 


67.0 


75.0 




54.1 
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33.0 
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o 

CO 
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25 

CD 
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_C 

O 

30 

0) 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv3069 


Helicobacter pylori J99 jhp1146 


Bacillus subtilis 168 ycsI


Rhodococcus erythropolis 


Corynebacterium glutamicum 
(Brevibacterium flavum) ATCC 
17965 csp1








Rhodococcus erythropolis 




Bacillus subtilis 168 




Streptomyces coelicolor A3(2) 
SCE25.30 




Staphylococcus aureus 


Chlamydophila pneumoniae 
AR39 CP09B7 


Chlamydia muridarum Nigg 
TC0129 




Streptomyces collinus Tu 1892 
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cn 
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lO 
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cn 

CO 
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CN 

cn 

CN 
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6295 
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Function 


methyltransferase 


hypothetical protein 


hypothetical protein 




UDP-N-acetylglucosamine 1-
carboxyvinyltransferase


hypothetical protein 


transcriptional regulator 




cysteine synthase 


O-acetylserine synthase 


hypothetical protein 


succinyl-CoA synthetase alpha 
chain 


hypothetical protein 


succinyl-CoA synthetase beta chain 




frenolicin gene E product 




succinyl-CoA coenzyme A 
transferase 


transcriptional regulator 


Matched 
length 
(aa) 
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Identity 
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61.0 


71.0 




44.8 


66.3 


45.9 




57.1 


61.1 


36.1 


52.9 


42.0 


39.8 




38.5 




o> 


38.6 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv0089 


Chlamydia pneumoniae 


Chlamydia muridarum Nigg 
TC0129 




Acinetobacter calcoaceticus 
NCIB 8250 murA 


Mycobacterium tuberculosis 
H37Rv Rv1314c 


Streptomyces coelicolor A3(2) 
SC2G5.15c




Bacillus subtilis 168 cysK 


Azotobacter vinelandii cysE2
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DR1844 


Coxiella burnetii Nine Mile Ph I 
sucD 
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r **r 

5 CO 
•) o 

- CO 
J CN 

- r- 
4 CN 


2724057 


2725359 


2725619 


2726577 


2727145 


2728133 


2729025 


2730916 


2731376 


SEQ
NO.
(a.a.)


6309 


6310 


6311 


6312 


6313 


6314 


6315 


6316 


6317 


: 5 

■) ! CO 

> to 


6319 


6320 


6321 


6322 


6323 


CM 
CO 
CO 


6325 


6326 


6327 


SEQ
NO.
(DNA)


2809 


a 

CO 
CN 


2811 


CN 

CO 

CM 


2813 


2814 


2815 


2816 


I s — - | CO 

CO 3 | CO 
CN M CN 


— 

2819 ! 


2820 


2821 


2822 


2823 


2824 


in 

CN 
CO 
CN 


2826 


2827 



EP 1 108 790 A2 













5
10


Function




phosphate transport system
regulatory protein


phosphate-specific transport
component


phosphate ABC transport system
permease protein


phosphate ABC transport system
permease protein


phosphate-binding protein
precursor


acetyltransferase




hypothetical protein


hypothetical protein


branched-chain amino acid
aminotransferase


hypothetical protein


hypothetical protein


5'-phosphoribosyl-5-aminoimidazole
synthetase


amidophosphoribosyl transferase
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Matched 
length 
(aa) 




CO 
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to 
to 

CM 


CM 

co 

CM 


m 

CM 
CO 
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CM 
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20 


Similarity 
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82.2 


78.5 


56.0 
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55.2 


74.2 
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79.0 
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Identity 
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46.5 
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51.4 
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40.0 
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24.7 


44.9 
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58.6 


81.0 
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OJ 

35 


Homologous gene 




Mycobacterium tuberculosis 
H37Rv Rv0821c phoY-2


Pseudomonas aeruginosa pstB 


Mycobacterium tuberculosis 
H37Rv Rv0830 pstA1 


Mycobacterium tuberculosis 
H37Rv Rv0829 pstC2 


Mycobacterium tuberculosis 
H37Rv phoS2 


Streptomyces coelicolor A3(2)
SCD84.18C 




Bacillus subtilis 168 bmrU 


Mycobacterium tuberculosis 
H37Rv Rv0813c


Solanum tuberosum BCAT2 


Corynebacterium 
ammoniagenes ATCC 6872 
ORF4 


Mycobacterium tuberculosis 
H37Rv Rv0810c


Corynebacterium 
ammoniagenes ATCC 6872 
purM 


Corynebacterium 
ammoniagenes ATCC 6872 
purF 
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Initial 
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hypothetical protein 


hypothetical protein 


hypothetical membrane protein 


hypothetical protein 


5'-phosphoribosyl-N-
formylglycinamidine synthetase
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111 
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extracellular nuclease 
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C4-dicarboxylate transporter 


dipeptidyl aminopeptidase


15 


Matched 
length 


CN 


m 

CO 


CN 


CM 


CO 
CO 

h- 




C CO 
C CM 
C CN 


CO 

r-- 




CO 

un 


in 

CO 
CO 




CM 




co 

CO 


20 


Similarity 
(%) 


75.8 


94.0 


87.1


71.0 


89.5 




C CO 
C CO 
C CO 


93.7 




77.9 


51.5 




68.7 


CD 

CO 


70.6 




Identity 
(%)


57.3 


75.9 


67.7 


64.0 


77.6 




C CO 

c o 

C CO 


81.0 




46.2 


28.0 




37.4 


49.0 


41.8 
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Mycobacterium tuberculosis 
H37Rv Rv0807 


Corynebacterium
ammoniagenes ATCC 6872 
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Corynebactenum 
ammoniagenes ATCC 6872 
ORF1 


Sulfolobus solfataricus 


Corynebacterium
ammoniagenes ATCC 6872 
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Corynebactenum 

ammoniagenes ATCC 6872 
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Corynebactenum 
ammoniagenes ATCC 6872 
purorf 
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Mycobacterium tuberculosis 
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Function 


pyruvate oxidase 


multidrug efflux protein 
transcriptional regulator


hypothetical membrane protein 


3-ketosteroid dehydrogenase 


transcriptional regulator, LysR family


hypothetical protein 



hypothetical protein 




hypothetical protein 


hypothetical membrane protein 


transcription initiation factor sigma 


trehalose-6-phosphate synthase 




trehalose-phosphatase 


glucose-resistance amylase 
regulator 


high-affinity zinc uptake system 
protein 


15 


Matched 
length 
(aa) 


r- 

ID 


S S 
S 




CO 

o 

CO 


CM 
CO 
CM 


CO 
CM 


CO 
CO 
CM 




o 


rr 
CD 
XT 


in 
m 


CO 
•V 




m 

CM 


CO 


CO 

tn 

CO 


20 


Similarity 
(%) 


75.8 


68.9 


78.4 


62.1 


O 

cri 

CD 


52.9 


55.6 




50.7 


64.0 


50.3 


66.7 




57.6 


60.2


46.7




Identity 
(%) 


46.3 


33.3 
^n 4 


45.6 


34.3 


37.1 


28.4 


26.7 




28.6 


36.0 


32.3 


38.8 




27.4 


24.7


22.4 


25 

CJ 
C 

o 
o 

30 ~ 

QJ 
-O 

35 
40 


Homologous gene 


Escherichia coli K12 poxB


Staphylococcus aureus plasmid 
pSK23 qacB 


1 1 

M 0) 
- 

i 3 0 

r- O LU 

tn ^» CO 
U 5 X 


Rhodococcus erythropolis SQ1 
kstD1 


Bacillus subtilis 168 alsR 


Mycobacterium tuberculosis 
H37Rv Rv3298c lpqC


Bacillus subtilis 168 ykrA 




Oryctolagus cuniculus kidney 
cortex rBAT 


Mycobacterium tuberculosis 
H37Rv Rv3737 


Streptomyces griseus hrdB 


Schizosaccharomyces pombe 
tps1 




Escherichia coli K12 otsB


Bacillus megaterium ccpA 


Haemophilus influenzae Rd 
HI0119 znuA


db Match 


o' 

CO 

CO 
X 

O 

CL 

o 
o 

LU 

CL 

cn 


prf:2212334B 


sp:YCDC_ECOLI
pir:D70551


™i 

cn 

CM 

o> 

CD 

cn 
o 

LL 

< 

ici. 
cn 


sp:ALSR_BACSU


pir:C70982 


pir:C69862 


u 


pir:A45264


pir:B70798 


pir:S41307 


sp:TPS1_SCHPO 




sp:OTSB_ECOLI 


LU 

i 
<' 

CL 
O 

o 

a. 

«A 


UJ 
< 
X 

<• 

M 

cL 
tn 




ORF 
(bp) 


1737 


1482 


531 


2142 


m 
o 
r- 


CO 
CO 


CO 
CO 


• cn 
in 

p T 


cn 
cn 

CO 


1503 


r- 

CM 
CO 


1455 


CO 

in 


CO 
CO 

r- 


1074 


CM 

cn 


45 


Terminal 
(nt) 


2776768 


2780446 


2780969 
2782315 


2782340 
2784656 


2785651 


2788594 


2788587


2789477 


2790550 


2792448 


2792857 


2794327 


2794812 


2795637 


2795676 


2797806 


50 


Initial 
(nt) 


2778504 


2778965 


2780439 
2780996


2784481 
2785615 


in 
in 
c~ 
tc 

CC 

r*- 

CN 


2787782 


2789399 


2789935 


2790152 


2790946 


cn 

IT 

CN 

cr 
r- 
c\ 


2792873 


2794300 


2794870 


2796749 


in 

CD 

1 co 

CO 

i cn 

CM 




SEQ
NO.
(a.a.)


6373 


6374 


6375 
6376 


6377 
6378 


a 
r* 
c 
CC 


6380 


6381 


*■ c\ 

> CC 

\ c- 

) CC 


6383 


6384 


6385 
6386 


r- 

CC 
CC 


CO 
CO 
CO 
CO 


6389 


6390 


55 


! O n < 

UJ u 
I S 2 c 


— r 

\ £ 


2874 


— \ 

m I co 

CO ' co 
CNI | CM 


2877 
2878 


a 

r- 
a 
r 


2880 


2881 


- | CN 

> CC 
5 CC 
J i O 


2883


2884 


2885 
2886 


r- 
cc 

CC 

o 


■ ! ^ 

» 1 CC 
) ' CC 

T 


2889 


2890
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Function 


ABC transporter 


hypothetical membrane protein 
transposase (ISA0963-5)




3-ketosteroid dehydrogenase 




lipopolysaccharide biosynthesis
protein or oxidoreductase or 
dehydrogenase 


dehydrogenase or myo-inositol 2-
dehydrogenase


c c 

CO) 

t "o 

I CL 

t-c 

( o 

C Q_ 
t 1/1 

f c 

< CO 
<• 

< CD 

t r* 
I t 

"i 

< tn 


shikimate transport protein 


transcriptional regulator 


ribosomal RNA ribose methylase or
tRNA/rRNA methyltransferase 


cysteinyl-tRNA synthetase 


PTS system, enzyme II sucrose 
protein (sucrose-specific IIABC 
component) 


sucrose 6-phosphate hydrolase or 
sucrase 


glucosamine-6-phosphate 
isomerase 


N-acetylglucosamine-6-phosphate
deacetylase 


15 


Matched 
length 
(aa) 


CO 
CM 
CM 


in cn 

CO c 
c 




<o 

LO 




O 
CM 


CO 
CM 


C CM 

c cn 

C CM 


o 

CO 


CM 
CM 


CO 
CO 


CO 


CO 
CO 
CO 


CO 


GO 
CM 


CO 

CO 
CO 


20 


Similarity 
(%) 


63.2 


87.4 


> 

i 
> 


62.0 




56.4 


LO 

cn 
to 


U LO 

Cj to 


80.8 


55.7 


47.3 


68.8 


77.0 


56.9 


69.4 


60.3 




Identity 
(%) 


31.4 


60.0 

Ol it 


r 

:> 

si 


32.1 




34.3 


CM 

uo 

CO 


l in 
c o 

C CO 


43.1 


32.6 


22.8 


42.2 


47.0 


35.3 


38.3 


30.2 


25 

rj 
c 

c 
o 
o 

30 

X3 

35 


Homologous gene 


Staphylococcus aureus 8325-4 
mreA 


Mycobacterium tuberculosis 
H37Rv Rv2060 


Archaeoglobus fulgidus


Rhodococcus erythropolis SQ1
kstD1




Thermotoga maritima MSB8
bplA 


Bacillus subtilis 168 idh or iolG 


« < 

J sz 

* </» 

C CM 

i 2 

" "c 

• o 

i CO 

"J 1c 

; O 

! cD 

o 
i in 
ljUJ 


Escherichia coli K12 shiA


Streptomyces coelicolor A3(2) 
SC5A7.19c 


Saccharomyces cerevisiae 
YOR201C PET56


Escherichia coli K12 cysS


Lactococcus lactis sacB


Clostridium acetobutylicum 
ATCC 824 scrB 


Escherichia coli K12 nagB


Vibrio furnissii SR1514 manD 


40 


db Match 


CM 

1 

CM 
CO 
CM 

< 
Ql 
cn 


pir:E70507 


CO 
CM 

cn 

3 

"ol 


w i 

cn 

CM 

cn 
to 
cn 
o 

Ll_ 

< 

Q. 

cn 




pir:B72359 


sp:MI2D_BACSU


sp:SHIA_ECOLI


sp:SHIA_ECOLI


CO 
r — 

1 

< 
in 
O 
in 

cn 


sp:PT56_YEAST 


sp:SYC_ECOLI


prf:2511335C 


gp:AF205034_4 


sp:NAGB_ECOLI 


ZD 
u. 
m 

5 

8 

CL 
%n 




ORF
(bp)


o 

CO 

to 


LO 
LO 

to 


1500 
201 


1689 


r-~- 


CO 
CO 


LO 
CO 
TT 


» 
i 


LO 
LO 
CO 


to 

CM 


in 
to 


cn 

CO 
CO 


1380 


1983 


1299 


cn 
m 


1152 


45 


Terminal 
(nt) 


2798509 


2799391 


2801034 
2801313 


2801558 


2803250 


2804074 


2804676 




2805113 


2806016 


2806599 


2807426 


2808399 


2809824 


2811960 


2813279 


2814081 


50 


Initial 
(nt) 


2797820 


2798837 


m co 

CO 

LO t- 

cn t- 
cn c 
r- cc 

CM CN 


2803246 


to 
cr 

Ol 

c 

CC 
CN 


2804691 


2805110 




r- 
tc 

cn 

IT 

c 
cc 

CN 


Tj 

tc 
c 

CC 
1 CN 


2807252 


2808364 


2809778 


2811806 


2813258 


2814037 


2815232 




SEQ
NO.
(a.a.)


6391 


6392 


6393 


6395 


tc 
o 
c 
tr 




6398 




o 
a 
c 

CC 


) c 
) c 

) ' 

) tc 


1 

6401 


6402 


c 

c 

cc 


T 
O 

> to 


6405 


ID 
O 
^* 
CO 


6407 


55 


SEQ 
NO. 
(DNA) 

2891 


2892 


2893 

2894


2895 


i s 
i ? 


2897 


2898 




o 
o 
a 
c 


> c 
1 c 

3 O 

si r 


s *- 

5 , O 
1 cn 
g cm 


2902 


(*■ 
c 
o 

C> 


2904 


2905 


2906 


2907 
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a co 
cc 


cn 
c 
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Function 


dihydrodipicolinate synthase 


glucokinase 


N-acetylmannosamine-6-phosphate
epimerase 




sialidase precursor 


L-asparagine permease operon 
repressor 


dipeptide transporter protein or 
heme-binding protein 


dipeptide transport system 
permease protein 


oligopeptide transport ATP-binding
protein


oligopeptide transport ATP-binding
protein 


homoserine/homoserine lactone
efflux protein or lysE type 
translocator 


leucine-responsive regulatory 
protein 




hypothetical protein 


hypothetical protein 


transcription factor 


15 


Matched 
length 
(aa) 


CO 
CO 
CM 


CM 
CO 


O 
CM 
CM 




CO 
CO 


Zll 


o 

CD 

in 


CM 
CO 


CO 


CO 

tn 

CM 


CO 
CO 


CM 




CM 

tn 


in 

CO 
CM 


in 












































CO 


CO 




CO 


CM 


rr 


CO 


CO 


r- 


r- 


CM 




CM 


in 




20 


GO 


CM 
CD 


r-.' 
to 


CO 
CD 




o 
to 


LO 


m 


CD 


CO 

r-- 


cd 

N- 


csi 

CO 


CD 
CD 




CO 
CO 


r- 


CO 






CM 




*T 




CO 


CO 


in 


CD 


in 




in 


O 




CO 


•«r 


CO 




IE 


CO* 
CM 


cb 

CM 


CD 
CO 




CM 


co' 

CM 


c\i 

CM 


CO 


co 
^* 


CO 


CO 

CM 


CO 




tri 
tn 


CD 


CO 

r- 


25 





































Table 1 (continued) 


Homologous gene 


Escherichia coli K12 dapA 


Streptomyces coelicolor A3(2) 
SC6E10.20C glk 


Clostridium perfringens NCTC 
8798 nanE 




Micromonospora viridifaciens 
ATCC 31146 nadA 


Rhizobium etli ansR 


Bacillus firmus OF4 dppA 


Bacillus firmus OF4 dppB


Bacillus subtilis 168 oppD 


Lactococcus lactis oppF 


Escherichia coli K12 rhtB 


Bradyrhizobium japonicum lrp




Mycobacterium tuberculosis 
H37Rv Rv3581c 


Mycobacterium tuberculosis 
H37Rv Rv3582c 


Mycobacterium tuberculosis 
H37Rv Rv3583c 


40 


db Match 


DAPA_ECOLI 


GLK_STRCO 


2516292A 




> 
o 

i 

X 


»' 

CD 
T 

CO 

UL 
< 


BFU64514J 


DPPB_BACFI 


Z> * 
CO 

o 
< 

a< 

CL 
CL 

o 


5 

O 
u.' 

Q_ 
CL 
O 


RHTB_ECOLI 


2309303A 




C70607 


1— 

O 
>- 

I — ' 

CO 


H70B03 






CL 
tn 


OL 
tn 


nrf 




CL 

tn 


CL 

cn 


CL 
CD 


ds 


:ds 




:ds 


nrf 

pn. 




CL 


.Us 


CL 






CO 
CO 
CO 


CO 
O 
CO 


CO 
CO 
CO 




1215 


CO 
CM 
r-» 


1608 


in 

CD 


1068 


CO 
CO 


CM 

to 


CO 
CO 


o 

CO 
CO 


O 
CO 

rr 


CO 
CO 

r- 


CO 

in 


45 


Terminal 
(nt) 


2816393 


2817317 


2818058 


2818137 


2818350 


2819557 


2822191 


2823337 


2825341 


2826156 


2826215 


2827404 


2827458 


2827904 


2828379 


2829156 


50 


Initial 
(nt) 


2815458 


2816409 


2817363 


2818313 


2819564 


2820285 


2820584 


2822387 


2824274 


2825341 


2826835 


2826922 


2827817 


2828383 


2829146 


2829749 




SEQ 

NO 

(a.a.) 


6408 


CO 
O 
■*T 
CO 


6410 


6411 


6412 


6413 


CO 


6415 


6416 


6417 


6418 


6419 


6420 


6421 


6422 


6423 


55 


SEQ 
NO. 
(DNA) 


2908 


2909 


2910 i 


2911 | 


2912 


2913 


2914 


in 

CM 


2916 


2917 


2918 


2919 


2920 


2921 


2922 


2923 
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Function 


two-component system response 
regulator 


two-component system sensor 
histidine kinase 




DNA repair protein RadA 


hypothetical protein 


hypothetical protein 


p-hydroxybenzaldehyde 
dehydrogenase 




mitochondrial carbonate 
dehydratase beta 


A/G-specific adenine glycosylase






L-2,3-butanediol dehydrogenase








hypothetical protein 


virulence factor 


virulence factor 


15 


Matched 
length 
(a.a) 


ro 
CM 

CM 


CO 




CO 
CD 

*r 


m 

CO 


CO 
CM 


^- 




O 
CM 


CO 
CO 
CM 






CO 

to 

CN 








r- 
cn 


cn 
cn 


CM 












































20 


Similari 
(%) 


o 

O 
r— 


67.7 




74.3 


73.3 


53.3 


85.1 




CM 

to 

CD 


70.7 






99.6 








69.1 


63.0 


55.0 




Identity 
(%) 


43.5 


29.3 




41.5 


ro 

O 


29.4 


59.5 




r- 
(d 

CO 


48.4 






99.2 








in 

CD 


57.0 


o 
tn 


Co Co Jo 
Oi O Oi 

Table 1 (continued) 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv3246c mtrA 


Escherichia coli K12 baeS




Escherichia coli K12 radA


Bacillus subtilis 168 yacK 


Mycobacterium tuberculosis 
H37Rv Rv3587c 


Pseudomonas putida NCIMB 
9866 plasmid pRA4000 




Chlamydomonas reinhardtii ca1 


Streptomyces antibioticus I MRU 
3720 mutY 






Brevibacterium saccharolyticum








Mycobacterium tuberculosis 
H37Rv Rv3592 


Pseudomonas aeruginosa 
ORF24222 


Pseudomonas aeruginosa 
ORF25110 


40 


db Match 


prf:2214304A 


i 

o 
o 

LU 

1 

CO 
UJ 

< 

CL 

tn 




sp:RADA_ECOLI 


CO 

o 
< 

o 
< 
> 

CL 
tfl 


O 
CO 

o 
r— 
O 

CL 


«' 

CO 
CO 

CD 
cn 

□l 

cl 

"ei. 
cn 




pir:T08204 


gp:AF121797_1 






«' 

r- 
o 
cn 
o 
o 

CD 
< 

cn 








pir:E70552 


GSP:Y2918B 


GSP:Y29193 




Si 


CO 
CM 

r-» 


1116 


CM 
CO 

m 


1392 


1098 


co 

CO 


1452 




CM 
CD 


CO 

r- 

CO 


1155 


to 
o 

CO 


■*r 
r- 


XT 
CN 
CO 


r- 


CN 
CO 


cn 

CM 


o 

CM 


co ! 

CM 


45 


Terminal 
(nt) 


2830779 


2831694 


2832666 


2834181 


2835285 


2835283 


2836048 


2837591 


2837956 


2839521 


2840716 


2840758 


2841848 


2842453 


2843233 


2843716 


2843432 


2845558 


2846101 


50 


Initial 
(nt) 


2830057 


2830779 


2832085 


2832790 


2834188 


2835969 


2837499 


2837737 


2838576 


2838643 


2839562 


2841063 


2841075 


2842130 


2842493 


m cm 

O 1 CM 

-«t ; 

CO 1 CO 

CO • CO 
CN j CN 


2845139 


2845889 




SEQ 
NO. 
(a a.) 


6424 


6425 


6426 


r*- 
CM 

• CD 


6428 


6429 


6430 


6431 


I 

6432 


6433 


6434 


6435 


6436 


6437 


6438


6439 
6440 


6441 


6442 


55 


SEQ 

NO. 
(DNA) 


2924 


2925 


1 CD 
1 ™ 

cn 

1 CM 
L. .. 


2927 


2928 


2929 


2930 


2931 


2932 


2933 


2934 


2935 


2936 


r-- co 
co : co 
cn • cn 

CM CN 


2939 


2940 


2941 


2942 
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.CD 



Function 


virulence factor 

1 


ClpC adenosine triphosphatase /
ATP-binding proteinase 


inosine monophosphate 
dehydrogenase 


transcription factor 


phenol 2-monooxygenase 










cc 

CO 
CO 
CQ. 

ao> 

£S 

a ro 

v <s> 
a cu 
k t_ 

c c 
to 

EE 
c o 

C CJ 

c c 


hypothetical protein 


lysyl-tRNA synthetase 


pantoate-beta-alanine ligase 






hypothetical membrane protein 


2-amino-4-hydroxy-6- 

hydroxymethyldihydropteridine 

pyrophosphokinase 


dihydroneopterin aldolase 


dihydropteroate synthase 


Matched 
length 
(a.a) 


LO 


CM 
CO 
CO 


CO 
CO 
IT 


CO 
CO 


o 

CO 
CD 










a ao 


O 
CM 


LO 


GO 
CO 
CM 






CO 
CO 


CO 

m 

T— 


CO 


CO 
CO 
CM 


Similarity 
(%) 


75.0 


CM 

CO 
CO 


70.2 


62.7 


60.9 










CO 
CO 
CO 
t -r— 


55.8 


71.2 


52.6 






69.6 


69.0 


69.5 


75.0 


Identity 
(%) 


74.0 


58.5 


37.1 


24.7 


33.5 










CO 
c o 
c o 


26.7 


41.7 


29.9 






29.0 


42.4 


38.1 


51.5 


Homologous gene 


Pseudomonas aeruginosa 
ORF25110 


Bacillus subtilis 168 mecB


Bacillus cereus ts-4 impdh 


Rhodococcus rhodochrous nitR 


Trichosporon cutaneum ATCC 
46490 










Corynebacterium glutamicum
lmrB


Mycobacterium tuberculosis 
H37Rv Rv3517 


Bacillus stearothermophilus lysS 


Corynebacterium glutamicum
ATCC 13032 panC 






Mycobacterium leprae 
MLCB2548.04C 


Methylobacterium extorquens 
AM1 folK 


Bacillus subtilis 168 folB


Mycobacterium leprae folP 


db Match 


GSP:Y29193 


CO 

o 
< 

CO 

1 

m 

O 

UJ 

CL 

Ui 


gp:AB035643J 


pir:JC6117 


3 
O 

or 
i— 

s 1 

CM 

X 
CL 

CL 
(/) 










' *~i 

r r-~ 

< CO 
C CO 

r i*- 

C CO 
C CM 
I LL. 
' < 
CL 

cn 


pir:G70807 


gp:AB012100_1 


CM 
< 

a. 

a 
a 

CL 

cn 






CO 
TT 

to 

CM 
CD 
O 
_J 

cL 
cn 


X 
LU 
l~ 
LU 

i 

a 
a 

X 
cL 


sp:FOLB_BACSU 


LO 

CO 
CO 

s 

CD 

< 

CL 

cn 




CM 
CO 


2775 


1431 


1011 


1785 


1716 


1941 


1722 


CM 
CO 
r— 


1443 


LO 

CO 


1578 


CO 

a> 
f- 


CO 

co 
<o 


CO 
CO 
r- 


LO 
CO 


r-»- 

, cr 


o 

CO 


837 


Terminal 
(nt) 


2846506 


2844166 


2848659 


2849779 


2851815 


2853732 


2855709 


2857516 


2859205 


2857613 


2859195 


2860505 


2862132 


2862929 


2863624 


2864384 


2864867 


2865346 


2865731 


Initial 
(nt) 


2846186 


2846940 


2847229 


2848769 


2850031 


2852017 


2853769 


2855795 


2859044 


2859055 


2860145 


2862082 


2862929 


2863621 


2864421 


2864848 


2865343 


2865735 


| 2866567 


n • — ^ 1 co 

2° 3 ? 


CO 


6445 


6446 
6447 
6448 


6449 


6450 


6451 


6452 


6453 


6454 


6455 


6456 


6457 
6458 


6459 


6460


6461 


2g| 

co 2 Q 


-r 

CO 

cn 

CM 


2944 


2945 


2946 
2947 
"2948 


[~2949 


o 

LO 

cn 

CM 

1 


2951 


2952 


2953 


2954 


2955 


2956 


2957 


2959 


o 

CO 
CO 
CM 


2961 



EP 1 108 790 A2 



5 
10 


Function 


GTP cyclohydrolase 1 




cell division protein FtsH 


hypoxanthine 
phosphoribosyltransferase 


cell cycle protein MesJ or cytosine 
deaminase-related protein 


D-alanyl-D-alanine 
carboxypeptidase 


inorganic pyrophosphatase 




spermidine synthase 


hypothetical membrane protein


hypothetical protein 


hypothetical protein 


hypothetical protein 


PTS system, beta-glucosides-
permease II ABC component




ferredoxin reductase 


hypothetical protein 


bacterial regulatory protein, marR 
family 


15 


Matched 
length 
(a.a) 


CO 
CO 

T— 




CM 
CO 

r- 


LO 
CO 


o 

CO 


o> 
to 


cn 

to 




r— 
o 
to 


CNN 

cor> 

T' 




CO 

r- 


CM 
O 
CM 


CO 
CO 






cn 


to 

CO 


20 


Similarity 
(%) 


86.2 




o 

oS 

CO 


83.0 


66.8 


51.4 


73.6 




80.7 


crib 
coco 


63.2 


60.1 


72.3 


59.6 




69.6 


73.2 


59.3 




Identity 
(%) 
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transcriptional regulator 








hypothetical protein 
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phosphodiesterase 


gluconate permease 
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L-lactate dehydrogenase 
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L-lactate dehydrogenase or FMN- 
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immunity repressor protein 
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transcriptase (RNA-dependent) 
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noduHn 21 -related protein 
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hypothetical protein 


transposase 


transposase protein fragment 
TnpNC 




glyceraldehyde-3-phosphate 
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transcriptional regulatory protein 
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dehydrogenase or streptomycin 
biosynthesis protein 


phosphoesterase 








stomatin 




DEAD box RNA helicase family 


hypothetical membrane protein 




phosphomethylpyrimidine kinase 


mercuric ion-binding protein or 
heavy-metal-associated domain 
containing protein 


ectoine/proline uptake protein 


Matched 
length 
(aa) 


tn 

CO 


CO 

tn 


0 

CD 
CM 


in 

CO 


0 

r- 

CN 


CM 
CO 
CO 


CO 

^" 

CO 


1242 








8 

CM 




1660 






tn 

CM 


CD 


CD 

CM 


Similarity 
(%) 


75.5 


58.3 


r»- 

0 
CD 


r*- 
tn 
m 


58.2 


59.6 


62.4 


!■»- 
CM 
CD 








57.3 




CM 
O 
CO 


0 

CD 




76.8 


70.1 


62.3 


Identity 
(%) 


43.0 


31.4 


r*- 
ih 

CM 


27.2 


25.9 


26.5 


34.1 


33.3 








.28.6 




58.4 


34.8 




50.4 


CO 
CD 


29.9 


Homologous gene 


Pseudomonas sp. P51 


Escherichia coli K12 xylE 


Salmonella typhimurium icIR 


Escherichia coli K12 ydgJ 


Listeria innocua strain 4450 


Sinorhizobium meliloti idhA


Streptomyces griseus strl 


Bacillus subtilis yvnB 








Caenorhabditis elegans unci 




Mycobacterium bovis BCG 
RvD1-Rv2024c 


Mycobacterium leprae u 2 266k 




Bacillus subtilis thiO 


Bacillus subtilis yvgY 


Corynebacterium glutamicum 
proP 


db Match 


0 

cn 

UJ 

to 

CL 

I 

u_ 

CO 

u 
y— 

bl 
cn 


— j 
O 
0 

UJ 

UJ 
— 1 

> 

X 

oL 
tn 


> 
t- 

1 

< 

CO 

or 
_> 
0 

cL 


sp:YDGJ_ECOLI 


gsp:W61761 


cn 

CO 

o 1 

CM 

2 
cL 

CO 


sp;STRI_STRGR 


TT 
TT 
O 
O 

r*- 
O 

5> 








sp:UNC1_CAEEL 




cn , 

in 
0 

CD 
CO 
r— 
O 
£D 

CL 
CO 


CO 
CD 
CO 
CO 
CM 
CO 
CM 

t: 
0. 




ZD 

to 
m 

1 

0 

X 

*— 

CL 
CO 


pir;F70041 


< 

in 

CJ) 
CM 
v— 
O 

tn 

CM 

tz 
ex. 




1089 


1524 


5 

CD 


1077 


CD 

co 


1005 


1083 


4032 


m 

CO 


CO 
CD 


1086 


*r 

r- 


CD 
CO 

o> 


4929 


r*- 
0 
tn 


0 

CO 
CO 


O 
O 
CO 


CO 
CM 


r- 

CO 
CO 


Terminal 
(nt) 


3257403 


3258561 


3261989 


3263221 


3264115 


3265146 


3266266 

i 
i 


3271093 


3267913 


3268618 


3272477 


3274488 


3275602 


3276671 


3281666 


3283101 


3282347 


3283383 


3283473 


Initial 
(nt) 


3258491 


3260084 


3261129 


3262145 


3263237 


3264142 


co 

in 
to 

CM 
CO 


3267062 


3268557 


3269235 


3271392 


3275231 


3276570 


3281599 


3282172 


3282742 


3282946 


3283141 


3284309 


(73 2 «, 


6881 


6882 


6883 


6884 


6885 


6886 


6887 


6888 


6889 


6890 


6891


6892 


6893 
6894 


tn 

CD 
CO 
CD 


6896 


6897 


6898 


6899 < 


w 2 £ 


\33&U 


3382 


3383 


3384 


3385 


3386 


3387


3388 


3389 


3390 


a> 

CO 
CO 


CM 
CD 
CO 

I" 


I 3393 


3394 


3395 


3396


3397 


3398 


3399 
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c 

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CD 






















Function 




thioredoxin ch2, M-type 


N-acetylmuramoyl-L-alanine 
amidase 






hypothetical protein 


hypothetical protein 


partitioning or sporulation protein 


glucose inhibited division protein 


hypothetical membrane protein 


ribonuclease P protein componer 


50S ribosomal protein L34 






L-aspartate-alpha-decarboxylase 
precursor 


2-isopropylmalate synthase 


hypothetical protein 


aspartate-semialdehyde
dehydrogenase


3-dehydroquinase 


Matched 
length 
(aa) 




CO 


CO 

cn 






CM 
CM 


CO 
CO 


CN 
r— 
CN 


CO 

m 


CO 
CO 


CO 
CN 


xr 






CO 
CO 


CO 
CO 


LO 
CO 


XT 

xr 

CO 


CO 
xr 








































Similarity 
(%) 




76.5 


75.4 






56.5 


60.5 


78.0 


64.7 


75.4 


59.4 


93.6 






100.0 


100.0


100.0 


100.0 


100.0 


Identity 
(%) 




42.0 


51.0 






34.4 


37.6 


65.0 


36.0 


44.7 


26.8 


83.0 






100.0 


100.0


100.0 


100.0 

i 


100.0 


Homologous gene 




Chlamydomonas reinhardtii thi2


Bacillus subtilis cwlB 






Mycobacterium tuberculosis 
H37Rv Rv3916c 


Pseudomonas putida ygi2 


Mycobacterium tuberculosis 
H37Rv parB 


Escherichia coli K12 gidB


Mycobacterium tuberculosis 
H37Rv Rv3921c 


Bacillus subtilis rnpA 


Mycobacterium avium rpmH 






Corynebacterium glutamicum 
panD 


Corynebacterium glutamicum 
ATCC 13032 leuA 


Corynebacterium glutamicum 
(Brevibacterium flavum) ATCC 
13032 orfX 


Corynebacterium glutamicum 
asd 


Corynebacterium glutamicum 
AS019aroD 


db Match 




LU 

rr 
_j 

o 

X 

r- 


in 
o 
< 

CQ 

m 1 

_J 

B 






D70851 


z> 
a 

LU 

c/> 

Q. 

CM 

O 
> 


r> 

CL 
LU 
CO 
CL 


—i 
o 
u 

LU 

m ' 

o 

o 


A70852 


cn 
U 
< 
CO 

<■ 

z 

a: 


MAU19165J 






AF116184J 


— i 
o 
cn 
O 
o 

i 

r— 

3 
LU 
—l 


-j 
0 
tr 
O 

°. 

LU 
-J 
> 


DHAS_CORGL 


AF124518J 






cL 

tO 


Cl 
co 






pir: 


Cl 
tn 


cL 

in 


Cl 

CO 


. Cl 


CL 
CO 


Cl 

cn 






CL 
CD 


CL 

cn 


Cl 
to 


Cl 
to 


CL 

cn 


ORF 
(bp) 


1 1851 


CM 
CO 


1242 


LLL 


1041 


CO 

s 


1152 


CO 
CO 


CD 
CO 
CO 


tn 

cn 


CO 

cn 

CO 


CO 
CO 
CO 


cn 

CM 


CM 
CM 
CN 


CO 
O 
xr 


1848 


in 
in 

CN 


1032 


r- 

XT 
XT 


Terminal 
(nt) 


3300119 


330172g 


3302996 


330198g 


3304475 


33029gg 


3303636 


3304835 


3305864 


3306682 


3307971 


3308412 


3309321 


3308822 




147573 


266154 i 


268814 


271691 


446521 


Initial 
(nt) 


3301303 


3301358 


3301755 


3302765 


3303435 


3303616 


3304787 


3305671 


3306532 


3307632 


3308369 


3308747 


3309028 


3309043 


147980 


268001 


269068 


270660 


446075 


SEQ 

NO 

(a.a.) 


6918


6919


6920 


6921 


6922 


6923 


6924 


6925 


6926 


6927 


6928 


6929 


6930 


6931


6932 


6933 


6934 


6935 


6936 


SEQ 
NO. 
(DNA) 


3418 


3419 


3420 


3421 


3422 


3423 


3424 


3425 


3426 


3427 


CO 
CN 
XT 
CO 


3429 


o 

CO 
XT 
CO 


3431 


3432 


3433


3434 


3435 


3436 
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c 



Function 


elongation factor Tu 


preprotein translocase secY subunit


isocitrate dehydrogenase 
(oxalosuccinatedecarboxylase) 


acyl-CoA carboxylase or biotin- 
binding protein 


citrate synthase 


putative binding protein or peptidyl- 
prolyl cis-trans isomerase 


glycine betaine transporter 


hypothetical membrane protein 


L-lysine permease 


aromatic amino acid permease 


hypothetical protein 


succinyl diaminopimelate
desuccinylase


proline transport system 


arginyl-tRNA synthetase


Matched 
length 
(aa) 


CO 

cn 

CO 


o 


CD 
CO 

r- 


O) 

to 


CO 


CO 


in 

O) 

m 


CO 
CM 
*T 


o 
in 


CO 
CO 


CO 
CO 


cn 

CO 
CO 


CM 

tn 


o 
to 
tn 


Similarity 
(%) 


I 100.0 


O 

o 
o 


100.0 


100.0 


100.0 


100.0 


100.0 


100.0 


100.0


100.0 


100.0 


100.0 


100.0 


100.0 


Identity 
(%) 


100.0 


o 
o 
o 


100.0 


100.0


100.0 


100.0 


100.0 


100.0 


o 
o 
o 


o 

o 
o 


100.0 


100.0 


o 
o 
o 


100.0 


Homologous gene 


Corynebacterium glutamicum 
ATCC 13059 tuf 


Corynebacterium glutamicum 
(Brevibacterium flavum) MJ233 
secY 


Corynebacterium glutamicum 
ATCC 13032 icd 


Corynebacterium glutamicum 
ATCC 13032 accBC 


Corynebacterium glutamicum 
ATCC 13032 gltA 


Corynebacterium glutamicum 
ATCC 13032 fkbA 


Corynebacterium glutamicum 
ATCC 13032 betP 


Corynebacterium glutamicum 
ATCC 13032 orf2 


Corynebacterium glutamicum 
ATCC 13032 lysl 


Corynebacterium glutamicum 
ATCC 13032 aroP 


Corynebacterium glutamicum 
ATCC 13032 orf3 


Corynebacterium glutamicum 
ATCC 13032dapE 


Corynebacterium glutamicum 
ATCC 13032 putP 


Corynebacterium glutamicum 
AS019 ATCC 13059 argS 


db Match 


sp:EFTU_CORGL 


_i 

o 

cc 
O 

a 

>' 
a 

UJ 

co 

CL 

in 


_j 
O 
rr 
o 

°. 

X 

g 

CL 
VI 


prf:2223173A 


i 

o 
rr 
O 
u 

>-' 

co 
o 

CL 


sp:FKBP_CORGL 


_f 
o 
cc 
O 

UJ 
CO 
cL 

(A 


_j 

e> 
tr 
o 
o 

1 

CM 

_J 

> 

CL 
</) 


_i 

CD 
CC 

o 

°. 

CO 

> 
_i 

cL 

to 


_j 
C3 
CC 
O 
a 

o. 1 
O 

$ 

CL 

tn 


pir:S52753 


prf:2106301A 


gp:CGPUTP_1 


O 
cn 
o 

°, 
cn 
>- 
co 

CL 

</» 


ORF 
(bp) 


1188 


1320 


r ■ 
2214 


1773 


1311 


tn 

CO 


1785 


1278 


1503 


1389 


948 


1107 


1572 


1650 


Terminal 
(nt). 


527563 


570771 


677831 


718580 


i 879148 


B79629 


946780 


1029006 


1030369 


1153295 


1154729 


1156837 


1218031 


1239923 


Initial 
| (nt) 


i 

526376 


569452 


680044 


720352 


i 877838 


879276 


CO 

cn 
cn 

cn 


1030263 


1031871 


1154683 


1155676 


1155731 


1219602 


1238274 


SEQ 

NO. 
(aa.) 


6937 


6938 


6939 


6940 


6941 


6942 


6943 


cn 

CO 


: m 
cn 

CO 


6946 


6947 


6948 


cn 

s cn 
I <° 


6950 


SEQ 
NO. 
(DNA) 


3437 


3438 


3439 


o 

CO 


CO 


CM 
CO 


CO 

ro 


CO 


m 

CO 


CO 
CO 


r- 

• co 


CO 

ro 


cn 

CO 


o 
to 

CO 
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5 
10 


Function 


diaminopimelate (DAP) 
decarboxylase (meso- 
diaminopimelate decarboxylase) 


homoserine dehydrogenase 


homoserine kinase 


ion channel subunit 


lysine exporter protein 


lysine export regulator protein 


acetohydroxy acid synthase, large 
subunit 


acetohydroxy acid synthase, small 
subunit 


acetohydroxy acid isomeroreductase 


3-isopropylmalate dehydrogenase 


PTS system, phosphoenolpyruvate 
sugar phosphotransferase 
(mannose and glucose transport) 


acetylglutamate kinase 


ornithine carbamoyltransferase 


arginine repressor 


15 


Matched 
length 
(aa) 


m 


to 


cn 
o 

CO 


CO 
CM 


CO 
CO 
CM 


o 
o> 

CM 


CO 
CM 

to 


CM 


CO 
CO 
CO 


o 

XT 

CO 


CO 
CO 
CO 


O 
OJ 


CD 
CO 




20 


Similarity 
(%) 


100.0 


100.0 


100.0 


100.0 | 


O 
O 

o 


100.0 


o 

o 
o 


100.0 


100.0 


100.0 


100.0 


100.0 


100.0 


100.0 




Identity 

(%) 


100.0


100.0 


100.0 


100.0 


100.0 


100.0 


100.0 


o 
o 
o 


100.0 

I 


o 
o 
o 


100.0 


100.0 


100.0 


100.0 


25 ^ 

CD 
C 
C 

o 

T — 

30 oj 


Homologous gene 


Corynebacterium glutamicum 
AS019 ATCC 13059 lysA 


Corynebacterium glutamicum
AS019 ATCC 13059 hom


Corynebacterium glutamicum 
AS019 ATCC 13059 thrB


Corynebacterium glutamicum 
R127orf3 


Corynebacterium glutamicum 
R127 lysE 


Corynebacterium glutamicum 
R127lysG 


Corynebacterium glutamicum 
ATCC 13032 ilvB 


Corynebacterium glutamicum 
ATCC 13032 ilvN 


Corynebacterium glutamicum 
ATCC 13032 ilvC 


Corynebacterium glutamicum 
ATCC 13032leuB 


Corynebacterium glutamicum 
KCTC1445 ptsM 


Corynebacterium glutamicum 
ATCC 13032 argB 


Corynebacterium glutamicum 
ATCC 13032 argF 


Corynebacterium glutamicum 
AS019argR 


35 
40 


db Match 


sp:DCDA_CORGL 


— i 
CD 
or 
O 
o 

o 

X 
Q 

CL 
tn 


sp:KHSE_CORGL 


gsp:W37716 


_j 
CD 
or 
O 

CJ 

uj' 

CO 

CL 
</» 


—i 

CD 

or 
o 

CO 

> 

CL 

tn 


—i 
CD 
or 
o 

s 

> 

CL 

to 


CO 

"V 

CO 
CO 

m 

CL 


pir:C48648 


_j 

CD 

tr 
o 

3 

Z> 
UJ 
_J 

CL 
*A 


prf:2014259A 


sp;ARGB_CORGL 


—j 
CD 
or 
o 

s' 

o 

iaL 
tn 


gp:AF041436J 




ORF 


1335 


1335 


OJ 

o> 


CM 
CO 


CO 
O 

r- 


O 
r-- 

CO 


1878 


CO 


1014 


1020 


2049 


CM 
CO 
CO 


in 

CD 


CO 
^— 
LO 


45 


Terminal 
(nt) 


1241263 


1243841 


1244781 


1328243 


1328246 


1329884 


1340008 


1340540 


1341737 


1354508 


in 

CO 
CM 

m 

CM 
XT 


1467372 


1469521 


1470040 


50 


Initial 
(nt) 


1239929 


1242507 


1243855 


1327617 


1328953 


1329015 


1338131 


1340025 


1340724 


1353489 


1423217 


1466491 


1468565 


1469528 






6951 


6952 


6953 


6954 


6955 


6956 


6957 


6958 


6959 


6960


6961 


6962 


6963 


6964 


55 


SEQ 
NO. 
(DNA) 


3451 


3452 


3453 


3454 


3455 


3456 


3457 


3458 


3459 


3460 


CO 
XT 
CO 


3462 


3463 


3464 ! 
I 
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5 
10 


Function 


NADH dehydrogenase 


phosphoribosyl-ATP- 
pyrophosphohydrolase 


ornithine-cyclodecarboxylase 


ammonium uptake protein, high 
affinity 


protein-export membrane protein 
secG 


phosphoenolpyruvate carboxylase 


chorismate synthase (5- 
enolpyruvylshikimate-3-phosphate 

phospholyase) 


restriction endonuclease 


sigma factor or RNA polymerase 
transcription factor 


glutamate-binding protein 


recA protein 


dihydrodipicolinate synthase 


dihydrodipicolinate reductase 


L-malate dehydrogenase (acceptor) 


15 


Matched 
length 
(aa) 


r- 

CD 
T 


CO 


CN 
CD 
CO 


CN 

tn 


r- 


cn 

CO 


o 


CN 
CO 
CD 


CO 
CO 


LO 

o> 

CN 


fe 

CO 


o 

CO 


CO 
CN 


o 
o 
to 


20 


Similarity 
(%) 


o 
o 
o 


100.0 


100.0 


100.0 


100.0 


100.0 ; 


o 

o 
o 


100.0 


100.0 


100.0 


100.0 

, 1 


100.0 . 


100.0 s 


100.0 




Identity 
(%) 


100.0 


o 
o 
o 


100.0 


100.0 


100.0 


100.0 


100.0 


100.0 


o 
o 
o 


100.0 


100.0 


100.0 


100.0 


100.0 


25 

CD 
C 

O 

30 y- 

0J 
.Q 

35 
40 


Homologous gene 


Corynebacterium glutamicum 
ATCC 13032 ndh 


Corynebacterium glutamicum 
AS019hisE 


Corynebacterium glutamicum 
ATCC 13032 ocd 


Corynebacterium glutamicum 
ATCC 13032 amt 


Corynebacterium glutamicum 
ATCC 13032 secG 


Corynebacterium glutamicum 
ATCC 13032 ppc 


Corynebacterium glutamicum 
AS019aroC 


Corynebacterium glutamicum 
ATCC 13032 cglllR 


Corynebacterium glutamicum 
ATCC 13869 sigB 


Corynebacterium glutamicum 
ATCC 13032 gluB 


Corynebacterium glutamicum 
AS019recA 


Corynebacterium glutamicum
(Brevibacterium lactofermentum) 
ATCC 13869 dapA 


Corynebacterium glutamicum 
(Brevibacterium lactofermentum) 
ATCC 13869 dapB 


Corynebacterium glutamicum 
R127 mqo 


db Match 


o' 
to 

CN 
CO 
CO 
CN 
— 1 

O 

o 

CL 

cn 


o 

CD 
CO 

o 
u. 

< 

CL 
CO 


CO 

r— 

o 
o 
—1 
CD 

CJ 
CL 

cn 


CO 

cn' 

CO 
r-- 
r-. 
O 

o 
_J 
CD 
o 

CL 
CD 


N i 

CN 
CO 

r-. 
O 
O 
_J 

CD 

o 

CL 

cn 


prf:1509267A 


*~ 

o< 

o 

CD 
CN 

LL 
< 
CL 

cn 


pir:B55225 


prf;2204286D 


_j 
CD 
cn 
o 
o 

CQ 

—J 
O 

CL 
CO 


_j 

o 
or 
O 
o 

<■ 

o 
at 
or 

CL 
«A 


LU 

or 

! 

Q 

CL 

<n 


sp:DAPB_C0RGL 


i 

CD 

rr 

CO 

-*r 

CM 

CD 
O 

CL 

cn 






1401 


CD 
CN 


1086 


1356 


CO 

CN 


2757 


1230 


1896 


CO 

cn 

CO 


tn 

CO 
CO 


1128 


CO 

O 
CO 


*r 


1500 


45 


Terminal 
(nt) 


1543154 


1586465 


1674123 


1675268 


1677049 


1677387 


6996 U I 


1882385 


2021846 


2061504 


2063989 


2079281 


2081191 


2113864 ' 


50 


Initial 
(nt) 


1544554 


1586725 


1675208 


1676623 


1677279 


1680143 


1720898 


o 
cn 
rr 
O 
oo 
co 


2020854 


2060620 


2065116 


2080183 


2081934 


21 15363 ' 




SEQ 
NO. 
(aa.) 


6965 


6966 


6967 


6968 


_ — 
6969 


6970 


6971 


6972 


6973 


6974 


6975 


6976 


6977 


6978 


55 


SEQ 
NO. 
(DNA)


3465 


3466 


3467 


3468 


3469 


3470 


3471 


3472 


3473I 


3474 


3475 


3476 : 


3477 


3478 
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5 

lU 


Function 


uridylyltransferase, uridylyl-
removing enzyme


nitrogen regulatory protein P-ll 


ammonium transporter 


glutamate dehydrogenase (NADP+) 


pyruvate kinase 


glucokinase 


glutamine synthetase 


threonine synthase 


ectoine/proline/glycine betaine 
carrier 


malate synthase 


isocitrate lyase 


glutamate 5-kinase 


cystathionine gamma-synthase 


ribonucleotide reductase 


glutaredoxin 


15 


Matched 
length 
(a.a) 


CN 
CO 
CD 


CN 


CO 
CO 
-<T 


r- 


m 

r- 


CO 
CN 
CO 




CO 


m 
£ 


cn 

CO 


CN 
CO 

■a 


CO 
CO 
CO 


386 


CO 




20 


Similarity 
(%) 


o 

o 
o 


100.0 


o 
o 
o 


100.0 

,. I 


100.0 


100.0 


100.0 


100.0 


100.0 


100.0


o 

o 
o 


100.0 


100.0 


100.0 


100.0 




Identity 
(%) 


o 
o 
o 


100.0 


100.0 


100.0 


100.0 


100.0 


o 
o 
o 


o 
o 
o 


100.0 


100.0 ! 


100.0


100.0


100.0 ; 

i 


100.0 


100.0 


25 ^ 

OJ 
O 

30 o 


Homologous gene 


Corynebacterium glutamicum 
ATCC 13032 glnD 


Corynebacterium glutamicum 
ATCC 13032 glnB 


Corynebacterium glutamicum 
ATCC 13032 amtP 


Corynebacterium glutamicum 
ATCC 17965 gdhA 


Corynebacterium glutamicum 
AS019pyk 


Corynebacterium glutamicum 
ATCC 13032 glk 


Corynebacterium glutamicum 
ATCC 13032 glnA 


Corynebacterium glutamicum 
thrC 


Corynebacterium glutamicum 
ATCC 13032 ectP 


Corynebacterium glutamicum 
ATCC 13032 aceB 


Corynebacterium glutamicum 
ATCC 13032 aceA 


Corynebacterium glutamicum 
ATCC 17965 proB 


Corynebacterium glutamicum 
AS019metB 


Corynebacterium glutamicum 
ATCC 13032 nrdl 


Corynebacterium glutamicum 
ATCC 13032 nrdH 


35 
40 


db Match 


gp:CAJ10319_4 


gp;CAJ10319_3 


gp:CAJ10319_2 


pir:S32227 


~j 

e> 

DC 

o 
u 

> 

CL 
*T 
CL 

<n 


i 

o 

CO 
CN 
CO 

cn 
o 
ll. 
< 

CL 

cn 


prf: 2322244 A 


sp:THRC_CORGL 


prf:2501295B 


m 

r- 
o 

'5. 


CO 

o 


sp:PROB_CORGL 


gp;AF126953J 


gp:AF112535_2 


m 

CO 

cn 
CN 

LL 
< 

cn 




si 


2076 


to 

CO 
CO 


1314 


1341 


1425 


cn 

CO 

cn 


1431 


1443 


1845 


2217 


1296 


1107 


1158 




CO 
CN 


45 


Terminal 
(nt) 


2169666 


2171751 


2172154 


2194742 


2205668 


2316582 


2350259 


2353600 


2448328 


2467925 


2472035 


2496670 


2590312 


2679684 


2680419 


50 


Initial 
(nt) 


2171741 


2172086 


2173467 


2196082 


2207092 


2317550 


2348829 


2355042 


2450172 


2470141 


2470740 


2497776 


2591469 


2680127 


2680649 




SEQ 

NO 

(aa.) 


1 6979 

i 


6980 


|6981 


CN 
CD 

cn 

ID 


6983 


6984 


6985 


6986 


6987 


6988


6989


6990


6991 


6992 


6993 


55 


SEQ 
NO. 
(DNA) 


3479 


3480 


3481 


3482 


CO 
CO 

co 


co 

CO 


m 

CO 
TT 
CO 


3486 


3487 


3488


! 

3489 


3490 


CD 
CO 


3492 


I CO 

« cn 

I * 
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c 



03 



Function 


meso-diaminopimelate D- 
dehydrogenase 


porin or cell wall channel forming 
protein 


acetate kinase 


phosphate acetyltransferase 


multidrug resistance protein or 
macrolide-efflux pump or 
drug:proton antiporter


ATP-dependent protease regulatory 
subunit 


prephenate dehydratase 


ectoine/proline uptake protein 




















Q) X ^ 
CO <D 


o 

CN 
CO 


ID 


r*- 

CD 
CO 


o 

CN 
CO 


cn 
m 


CN 

in 

CO 


in 

r- 
CO 


o 
in 


Similarity
(%) 


O 
O 

o 


100.0


100.0


100.0


100.0


100.0 


100.0 


100.0


CO 


















Identity 
(%) 


100.0 


100.0 


100.0 


100.0 


100.0 


o 

o 
o 


100.0 


100.0 


Homologous gene 


Corynebacterium glutamicum 
KY10755 ddh 


Corynebacterium glutamicum 
MH20-22B porA 


Corynebacterium glutamicum 
ATCC 13032 ackA 


Corynebacterium glutamicum 
ATCC 13032 pta 


Corynebacterium glutamicum 
ATCC 13032 cmr 


Corynebacterium glutamicum 
ATCC 13032 clpB


Corynebacterium glutamicum 
pheA 


Corynebacterium glutamicum 
ATCC 13032 proP 


db Match 


_i 
o 

tr 
O 
o 

i 

X 
Q 
Q 

CL 
CO 


gp:CGL238703J 


sp:ACKA_CORGL 


prf2516394A 


< 

CM 
CM 
CO 

cn 
O 

CO 
CM 

O- 


_» 

O 

C£ 
O 

°. 

LO 
Q_ 

—I 

o 

CL 
cn 


< 

CO 
CO 

s 

CM 
CL 


prf:2501295A 




o 

CD 
O) 


in 

CO 


1191 


CO 

cn 


1377 


2556 


m 

cn 


1512 


Terminal 
(nt) 


2766756 


2887944 


2935315 


2936508 


2962718 


2963606 


3098578 


3272563 


Initial 
(nt) 


2787715 


2888078 


2936505 


I 

2937494 


2961342 


2966161 


3099522 


3274074 


SEQ 

NO. 
(a.a.) 


6994 


6995 


6996


6997 


6998 


6999 


7000 


7001 


2o| 

W 2 Q 


3494 


3495 


CD 

O) 

. CO 


3497 


3498 


3499 


3500 


3501 
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Determination of effective mutation site 

(1) Identification of mutation site based on the comparison of the gene nucleotide sequence of lysine-producing B-6
strain with that of wild type strain ATCC 13032

[0374] Corynebacterium glutamicum B-6, which is resistant to S-(2-aminoethyl)cysteine (AEC), rifampicin,
streptomycin and 6-azauracil, is a lysine-producing mutant bred from the wild type ATCC 13032 strain by
multiple rounds of random mutagenesis with the mutagen N-methyl-N'-nitro-N-nitrosoguanidine (NTG)
and screening (Appl. Microbiol. Biotechnol., 32: 269-273 (1989)). First, the nucleotide sequences of genes derived
from the B-6 strain and considered to relate to lysine production were determined by a method similar to the above.
The genes relating to lysine production include lysE and lysG, which are lysine-excreting genes; ddh, dapA, hom
and lysC (encoding diaminopimelate dehydrogenase, dihydrodipicolinate synthase, homoserine dehydrogenase and
aspartokinase, respectively), which are lysine-biosynthetic genes; and pyc and zwf (encoding pyruvate carboxylase
and glucose-6-phosphate dehydrogenase, respectively), which are glucose-metabolizing genes. The nucleotide
sequences of the genes derived from the production strain were compared with the corresponding nucleotide sequences
of the ATCC 13032 strain genome represented by SEQ ID NOS:1 to 3501 and analyzed. As a result, mutation points
were observed in many genes. For example, no mutation site was observed in lysE, lysG, ddh, dapA, and the like,
whereas amino acid replacement mutations were found in hom, lysC, pyc, zwf, and the like. Among these mutation
points, those considered to contribute to the production were extracted on the basis of known biochemical
or genetic information. Among the mutation points thus extracted, a mutation, Val59Ala, in hom and a mutation,
Pro458Ser, in pyc were evaluated as to whether or not they were effective according to the following method.
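
The comparison step described above can be illustrated with a minimal sketch. It assumes the wild type and mutant proteins have already been translated and aligned to equal length; the function name and the toy peptides are hypothetical and are not the hom or pyc sequences of the specification. The sketch only shows how a replacement such as Val59Ala is named from a position-by-position comparison.

```python
# Three-letter amino acid codes, as used in the mutation names in the text.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def amino_acid_replacements(wild_type: str, mutant: str) -> list[str]:
    """List substitutions between two equal-length protein sequences.

    Positions are 1-based, so a Val-to-Ala change at residue 59
    is reported as "Val59Ala".
    """
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must be aligned to the same length")
    return [
        f"{THREE_LETTER[w]}{pos}{THREE_LETTER[m]}"
        for pos, (w, m) in enumerate(zip(wild_type, mutant), start=1)
        if w != m
    ]


# Hypothetical 5-residue toy peptides (assumption, for illustration only).
toy_wild = "MKVLP"
toy_mutant = "MKALP"
print(amino_acid_replacements(toy_wild, toy_mutant))  # ['Val3Ala']
```

Insertions and deletions are deliberately out of scope here; a real comparison would first align the sequences.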

(2) Evaluation of mutation, Val59Ala, in hom and mutation, Pro458Ser, in pyc

[0375] It is known that a mutation in hom inducing requirement or partial requirement for homoserine imparts lysine
productivity to a wild type strain (Amino Acid Fermentation, ed. by Hiroshi Aida et al., Japan Scientific Societies Press).
However, the relationship between the mutation, Val59Ala, in hom and lysine production is not known. Whether or not
the mutation, Val59Ala, in hom is an effective mutation can be examined by introducing the mutation into the wild
type strain and examining the lysine productivity of the resulting strain. On the other hand, whether or not the
mutation, Pro458Ser, in pyc is effective can be examined by introducing this mutation into a lysine-producing strain
which has a deregulated lysine-biosynthetic pathway and is free from the pyc mutation, and comparing the lysine
productivity of the resulting strain with that of the parent strain. As such a lysine-producing bacterium, No. 58 strain
(FERM BP-7134) was selected (hereinafter referred to as the "lysine-producing No. 58 strain" or the "No. 58 strain").
Based on the above, it was decided to introduce the mutation, Val59Ala, in hom and the mutation, Pro458Ser, in pyc
into the wild type strain of Corynebacterium glutamicum ATCC 13032 (hereinafter referred to as the "wild type
ATCC 13032 strain" or the "ATCC 13032 strain") and the lysine-producing No. 58 strain, respectively, using the gene
replacement method. A plasmid vector pCES30 for the gene replacement was constructed by the following method.

[0376] A plasmid vector pCE53 having a kanamycin-resistant gene and being capable of autonomously replicating
in Coryneform bacteria (Mol. Gen. Genet., 196: 175-178 (1984)) and a plasmid pMOB3 (ATCC 77282) containing a
levansucrase gene (sacB) of Bacillus subtilis (Molecular Microbiology, 6: 1195-1204 (1992)) were each digested with
PstI. Then, after agarose gel electrophoresis, a pCE53 fragment and a 2.6 kb DNA fragment containing sacB were
each extracted and purified using GENECLEAN Kit (manufactured by BIO 101). The pCE53 fragment and the 2.6 kb
DNA fragment were ligated using Ligation Kit ver. 2 (manufactured by Takara Shuzo), and the ligation product was
introduced into the ATCC 13032 strain by the electroporation method (FEMS Microbiology Letters, 65: 299 (1989)).
The cells were cultured on BYG agar medium (medium prepared by adding 10 g of glucose, 20 g of peptone
(manufactured by Kyokuto Pharmaceutical), 5 g of yeast extract (manufactured by Difco), and 16 g of Bactoagar
(manufactured by Difco) to 1 liter of water, and adjusting its pH to 7.2) containing 25 µg/ml kanamycin at 30°C for
2 days to obtain a transformant acquiring kanamycin-resistance. As a result of digestion analysis with restriction
enzymes, it was confirmed that a plasmid extracted from the resulting transformant by the alkali SDS method had a
structure in which the 2.6 kb DNA fragment had been inserted into the PstI site of pCE53. This plasmid was named
pCES30.

[0377] Next, two genes having a mutation point, hom and pyc, were amplified by PCR, and inserted into pCES30
according to the TA cloning method (Bio Experiment Illustrated vol. 3, published by Shujunsha). Specifically, pCES30
was digested with BamHI (manufactured by Takara Shuzo), subjected to agarose gel electrophoresis, and extracted
and purified using GENECLEAN Kit (manufactured by BIO 101). Both ends of the resulting pCES30 fragment were
blunted with DNA Blunting Kit (manufactured by Takara Shuzo) according to the attached protocol. The blunt-ended
pCES30 fragment was concentrated by extraction with phenol/chloroform and precipitation with ethanol, and allowed
to react in the presence of Taq polymerase (manufactured by Roche Diagnostics) and dTTP at 70°C for 2 hours so
that a nucleotide, thymine (T), was added to the 3'-end to prepare a T vector of pCES30.

[0378] Separately, chromosomal DNA was prepared from the lysine-producing B-6 strain according to the method
of Saito et al. (Biochim. Biophys. Acta, 72: 619 (1963)). Using the chromosomal DNA as a template, PCR was carried
out with Pfu turbo DNA polymerase (manufactured by Stratagene). For the mutated hom gene, the DNAs having the
nucleotide sequences represented by SEQ ID NOS:7002 and 7003 were used as the primer set. For the mutated pyc
gene, the DNAs having the nucleotide sequences represented by SEQ ID NOS:7004 and 7005 were used as the primer
set. The resulting PCR product was subjected to agarose gel electrophoresis, and extracted and purified using
GENECLEAN Kit (manufactured by BIO 101). Then, the PCR product was allowed to react in the presence of Taq
polymerase (manufactured by Roche Diagnostics) and dATP at 72°C for 10 minutes so that a nucleotide, adenine (A),
was added to the 3'-end.

[0379] The above pCES30 T vector fragment and the PCR product of the mutated hom gene (1.7 kb) or mutated pyc
gene (3.6 kb) to which the nucleotide A had been added were concentrated by extraction with phenol/chloroform
and precipitation with ethanol, and then ligated using Ligation Kit ver. 2. The ligation products were introduced into the
ATCC 13032 strain according to the electroporation method, and cultured on BYG agar medium containing 25 µg/ml
kanamycin at 30°C for 2 days to obtain kanamycin-resistant transformants. Each of the resulting transformants was
cultured overnight in BYG liquid medium containing 25 µg/ml kanamycin, and a plasmid was extracted from the
culture according to the alkali SDS method. As a result of digestion analysis using restriction enzymes, it was
confirmed that the plasmid had a structure in which the 1.7 kb or 3.6 kb DNA fragment had been inserted into pCES30.
The plasmids thus constructed were named pChom59 and pCpyc458, respectively.

[0380] The introduction of the mutations into the wild type ATCC 13032 strain and the lysine-producing No. 58 strain
according to the gene replacement method was carried out as follows. Specifically, pChom59
and pCpyc458 were introduced into the ATCC 13032 strain and the No. 58 strain, respectively, and strains in which the
plasmid was integrated into the chromosomal DNA by homologous recombination were selected using the method of
Ikeda et al. (Microbiology, 144: 1863 (1998)). Then, the strains in which the second homologous recombination had
occurred were selected by a selection method making use of the fact that the Bacillus subtilis levansucrase encoded
by pCES30 produces a suicidal substance (J. of Bacteriol., 174: 5462 (1992)). Among the selected strains, strains in
which the wild type hom and pyc genes possessed by the ATCC 13032 strain and the No. 58 strain were replaced with
the mutated hom and pyc genes, respectively, were isolated. The method is specifically explained below.

[0381] One strain was selected from the transformants containing the plasmid, pChom59 or pCpyc458, and the
selected strain was cultured in BYG medium containing 20 µg/ml kanamycin, and pCG11 (Japanese Published
Examined Patent Application No. 91827/94) was introduced thereinto by the electroporation method. pCG11 is a plasmid
vector having a spectinomycin-resistant gene and a replication origin which is the same as that of pCE53. After
introduction of pCG11, the strain was cultured on BYG agar medium containing 20 µg/ml kanamycin and 100 µg/ml
spectinomycin at 30°C for 2 days to obtain transformants resistant to both kanamycin and spectinomycin. The
chromosome of one strain of these transformants was examined by Southern blotting hybridization according to the
method reported by Ikeda et al. (Microbiology, 144: 1863 (1998)). As a result, it was confirmed that pChom59 or
pCpyc458 had been integrated into the chromosome by homologous recombination of the Campbell type. In such a
strain, the wild type and mutated hom or pyc genes are present close to each other on the chromosome, and the second
homologous recombination is liable to arise therebetween.

[0382] Each of these transformants (having been recombined once) was spread on Suc agar medium (medium
prepared by adding 100 g of sucrose, 7 g of meat extract, 10 g of peptone, 3 g of sodium chloride, 5 g of yeast extract
(manufactured by Difco), and 18 g of Bactoagar (manufactured by Difco) to 1 liter of water, and adjusting its pH to 7.2)
and cultured at 30°C for a day. Then the colonies thus growing were selected in each case. Since a strain in which the
sacB gene is present converts sucrose into a suicide substrate, it cannot grow in this medium (J. Bacteriol., 174: 5462
(1992)). On the other hand, a strain in which the sacB gene was deleted due to the second homologous recombination
between the wild type and the mutated hom or pyc genes positioned close to each other forms no suicide substrate
and, therefore, can grow in this medium. In the homologous recombination, either the wild type gene or the mutated
gene is deleted together with the sacB gene. When the wild type gene is deleted together with the sacB gene, gene
replacement into the mutated type arises.
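
The selection logic of the sucrose medium can be summarized in a small model. This is only an illustrative sketch: the class and function names are hypothetical, and the model assumes nothing beyond what is stated above, namely that a sacB-carrying strain cannot grow on sucrose and that the second recombination removes sacB together with either the wild type or the mutated gene copy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strain:
    alleles: frozenset  # gene copies on the chromosome: "wild" and/or "mutant"
    has_sacB: bool      # True while the integrated vector is still present


def grows_on_sucrose(strain: Strain) -> bool:
    """A strain carrying sacB converts sucrose into a suicide substrate."""
    return not strain.has_sacB


def second_recombination(integrant: Strain, lost_allele: str) -> Strain:
    """Excise the vector (with sacB) together with one of the two gene copies."""
    if lost_allele not in integrant.alleles:
        raise ValueError(f"{lost_allele!r} is not present in this strain")
    return Strain(integrant.alleles - {lost_allele}, has_sacB=False)


# Single-crossover integrant: both gene copies plus the vector.
integrant = Strain(frozenset({"wild", "mutant"}), has_sacB=True)

replaced = second_recombination(integrant, lost_allele="wild")    # gene replacement
reverted = second_recombination(integrant, lost_allele="mutant")  # back to wild type

for name, s in [("integrant", integrant), ("replaced", replaced), ("reverted", reverted)]:
    print(name, sorted(s.alleles), "grows on sucrose:", grows_on_sucrose(s))
```

Both outcomes of the second recombination grow on sucrose, which is why the growing colonies still have to be distinguished by determining the nucleotide sequence of the gene.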

[0383] Chromosomal DNA of each of the thus obtained second recombinants was prepared by the above method of
Saito et al. PCR was carried out using Pfu turbo DNA polymerase (manufactured by Stratagene) and the attached
buffer. For the hom gene, DNAs having the nucleotide sequences represented by SEQ ID NOS:7002 and 7003 were
used as the primer set. For the pyc gene, DNAs having the nucleotide sequences represented by SEQ
ID NOS:7004 and 7005 were used as the primer set. The nucleotide sequences of the PCR products were determined
by the conventional method so that it was judged whether the hom or pyc gene of the second recombinant was a wild
type or a mutant. As a result, the second recombinants designated HD-1 and No. 58pyc were confirmed to be the target
strains having the mutated hom gene and pyc gene, respectively.
(3) Lysine production test of HD-1 and No. 58pyc strains



[0384] The HD-1 strain (strain obtained by incorporating the mutation, Val59Ala, in the hom gene into the ATCC
13032 strain) and the No. 58pyc strain (strain obtained by incorporating the mutation, Pro458Ser, in the pyc gene into
the lysine-producing No. 58 strain) were subjected to a culture test in a 5 l jar fermenter by using the ATCC 13032
strain and the lysine-producing No. 58 strain, respectively, as a control. Thus lysine production was examined.
[0385] After culturing on BYG agar medium at 30°C for 24 hours, each strain was inoculated into 250 ml of a seed
medium (medium prepared by adding 50 g of sucrose, 40 g of corn steep liquor, 8.3 g of ammonium sulfate, 1 g of
urea, 2 g of potassium dihydrogenphosphate, 0.83 g of magnesium sulfate heptahydrate, 10 mg of iron sulfate
heptahydrate, 1 mg of copper sulfate pentahydrate, 10 mg of zinc sulfate heptahydrate, 10 mg of β-alanine, 5 mg of
nicotinic acid, 1.5 mg of thiamin hydrochloride, and 0.5 mg of biotin to 1 liter of water, adjusting its pH to 7.2, and then
adding 30 g of calcium carbonate) contained in a 2 l baffle-attached Erlenmeyer flask and cultured therein
at 30°C for 12 to 16 hours. The total amount of the seed culture was inoculated into 1,400 ml of a main culture
medium (medium prepared by adding 60 g of glucose, 20 g of corn steep liquor, 25 g of ammonium chloride, 2.5 g of
potassium dihydrogenphosphate, 0.75 g of magnesium sulfate heptahydrate, 50 mg of iron sulfate heptahydrate, 13
mg of manganese sulfate pentahydrate, 50 mg of calcium chloride, 6.3 mg of copper sulfate pentahydrate, 1.3 mg of
zinc sulfate heptahydrate, 5 mg of nickel chloride hexahydrate, 1.3 mg of cobalt chloride hexahydrate, 1.3 mg of
ammonium molybdate tetrahydrate, 14 mg of nicotinic acid, 23 mg of β-alanine, 7 mg of thiamin hydrochloride, and
0.42 mg of biotin to 1 liter of water) contained in a 5 l jar fermenter and cultured therein at 32°C, 1 vvm and 800 rpm
while controlling the pH to 7.0 with aqueous ammonia. When glucose in the medium had been consumed, a glucose
feeding solution (medium prepared by adding 400 g of glucose and 45 g of ammonium chloride to 1 liter of water) was
continuously added. The addition of the feeding solution was carried out at a controlled speed so as to maintain the
dissolved oxygen concentration within a range of 0.5 to 3 ppm. After culturing for 29 hours, the culture was terminated.
The cells were separated from the culture medium by centrifugation and then L-lysine hydrochloride in the supernatant
was quantified by high performance liquid chromatography (HPLC). The results are shown in Table 2 below.



Table 2

Strain        L-Lysine hydrochloride yield (g/l)
ATCC 13032    0
HD-1          8
No. 58        45
No. 58pyc     51



[0386] As is apparent from the results shown in Table 2, the lysine productivity was improved by introducing the
mutation, Val59Ala, in the hom gene or the mutation, Pro458Ser, in the pyc gene. Accordingly, it was found that the
mutations are both effective mutations relating to the production of lysine. A strain, AHP-3, in which the mutation,
Val59Ala, in the hom gene and the mutation, Pro458Ser, in the pyc gene have been introduced into the wild type ATCC
13032 strain together with the mutation, Thr311Ile, in the lysC gene was deposited on December 5, 2000, in the
National Institute of Bioscience and Human Technology, Agency of Industrial Science and Technology (Higashi 1-1-3,
Tsukuba-shi, Ibaraki, Japan) as FERM BP-7382.

Example 3 

Reconstruction of lysine-producing strain based on genome information 

[0387] The lysine-producing mutant B-6 strain (Appl. Microbiol. Biotechnol., 32: 269-273 (1989)), which has been
constructed by multiple rounds of random mutagenesis with NTG and screening from the wild type ATCC 13032 strain,
produces a remarkably large amount of lysine hydrochloride when cultured in a jar at 32°C using glucose as a carbon
source. However, since the fermentation period is long, the production rate is less than 2.1 g/l/h. Breeding was
therefore performed to reconstitute, in the wild type ATCC 13032 strain, only the effective mutations relating to the
production of lysine among the at least 300 mutations estimated to have been introduced into the B-6 strain.

(1) Identification of mutation point and effective mutation by comparing the gene nucleotide sequence of the B-6 strain
with that of the ATCC 13032 strain



[0388] As described above, the nucleotide sequences of genes derived from the B-6 strain were compared with the
corresponding nucleotide sequences of the ATCC 13032 strain genome represented by SEQ ID NOS:1 to 3501 and
analyzed to identify many mutation points accumulated in the chromosome of the B-6 strain. Among these, a mutation,
Val59Ala, in hom, a mutation, Thr311Ile, in lysC, a mutation, Pro458Ser, in pyc and a mutation, Ala213Thr, in zwf
were specified as effective mutations relating to the production of lysine. Breeding to reconstitute the 4 mutations in
the wild type strain and thereby construct an industrially important lysine-producing strain was carried out according
to the method shown below.

(2) Construction of plasmid for gene replacement having mutated gene 

[0389] The plasmid for gene replacement, pChom59, having the mutated hom gene and the plasmid for gene
replacement, pCpyc458, having the mutated pyc gene were prepared in the above Example 2(2). Plasmids for gene
replacement having the mutated lysC and zwf were produced as described below.

[0390] The lysC and zwf genes having mutation points were amplified by PCR, and inserted into a plasmid for gene
replacement, pCES30, according to the TA cloning method described in Example 2(2) (Bio Experiment Illustrated, Vol. 3).

[0391] Separately, chromosomal DNA was prepared from the lysine-producing B-6 strain according to the above
method of Saito et al. Using the chromosomal DNA as a template, PCR was carried out with Pfu turbo DNA polymerase
(manufactured by Stratagene). For the mutated lysC gene, the DNAs having the nucleotide sequences represented by
SEQ ID NOS:7006 and 7007 were used as the primer set. For the mutated zwf gene, the DNAs having the nucleotide
sequences represented by SEQ ID NOS:7008 and 7009 were used as the primer set. The resulting PCR product was
subjected to agarose gel electrophoresis, and extracted and purified using GENECLEAN Kit (manufactured by BIO 101).
Then, the PCR product was allowed to react in the presence of Taq DNA polymerase (manufactured by Roche Diagnostics)
and dATP at 72°C for 10 minutes so that a nucleotide, adenine (A), was added to the 3'-end.

[0392] The above pCES30 T vector fragment and the PCR product of the mutated lysC gene (1.5 kb) or mutated zwf
gene (2.3 kb) to which the nucleotide A had been added were concentrated by extraction with phenol/chloroform
and precipitation with ethanol, and then ligated using Ligation Kit ver. 2. The ligation products were introduced into the
ATCC 13032 strain according to the electroporation method, and cultured on BYG agar medium containing 25 µg/ml
kanamycin at 30°C for 2 days to obtain kanamycin-resistant transformants. Each of the resulting transformants was
cultured overnight in BYG liquid medium containing 25 µg/ml kanamycin, and a plasmid was extracted from the
culture according to the alkali SDS method. As a result of digestion analysis using restriction enzymes, it was
confirmed that the plasmid had a structure in which the 1.5 kb or 2.3 kb DNA fragment had been inserted into pCES30.
The plasmids thus constructed were named pClysC311 and pCzwf213, respectively.

(3) Introduction of mutation, Thr311Ile, in lysC into one point mutant HD-1

[0393] Since the one point mutant HD-1, in which the mutation, Val59Ala, in hom was introduced into the
wild type ATCC 13032 strain, had been obtained in Example 2(2), the mutation, Thr311Ile, in lysC was introduced into
the HD-1 strain using pClysC311 produced in the above (2) according to the gene replacement method described in
Example 2(2). PCR was carried out using chromosomal DNA of the resulting strain and, as the primer set, DNAs having
the nucleotide sequences represented by SEQ ID NOS:7006 and 7007 in the same manner as in Example 2(2). The
nucleotide sequence of the PCR product was determined in the usual manner, and it was confirmed
that the strain, which was named AHD-2, was a two point mutant having the mutated lysC gene in addition to the
mutated hom gene.

(4) Introduction of mutation, Pro458Ser, in pyc into two point mutant AHD-2 

[0394] The mutation, Pro458Ser, in pyc was introduced into the AHD-2 strain using the pCpyc458 produced in
Example 2(2) by the gene replacement method described in Example 2(2). PCR was carried out using chromosomal
DNA of the resulting strain and, as the primer set, DNAs having the nucleotide sequences represented by SEQ ID
NOS:7004 and 7005 in the same manner as in Example 2(2). The nucleotide sequence of
the PCR product was determined in the usual manner, and it was confirmed that the strain, which was named AHP-3,
was a three point mutant having the mutated pyc gene in addition to the mutated hom gene and lysC gene.

(5) Introduction of mutation, Ala213Thr, in zwf into three point mutant AHP-3 

[0395] The mutation, Ala213Thr, in zwf was introduced into the AHP-3 strain using the pCzwf213 produced in the
above (2) by the gene replacement method described in Example 2(2). PCR was carried out using chromosomal DNA
of the resulting strain and, as the primer set, DNAs having the nucleotide sequences represented by SEQ ID NOS:
7008 and 7009 in the same manner as in Example 2(2). The nucleotide sequence of the PCR
product was determined in the usual manner, and it was confirmed that the strain, which was named APZ-4, was a four
point mutant having the mutated zwf gene in addition to the mutated hom gene, lysC gene and pyc gene.
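
The stepwise reconstruction in (3) to (5) can be recorded as an ordered list of single introductions. The sketch below is illustrative only; the data structure and function name are hypothetical, while the strain names and mutations are those given in the text.

```python
# Each step introduces one mutation into the strain produced by the previous
# step, starting from the wild type ATCC 13032 strain.
STEPS = [
    ("HD-1", "hom Val59Ala"),
    ("AHD-2", "lysC Thr311Ile"),
    ("AHP-3", "pyc Pro458Ser"),
    ("APZ-4", "zwf Ala213Thr"),
]


def cumulative_genotypes(steps):
    """Return the full set of introduced mutations carried by each strain."""
    carried = []
    genotypes = {}
    for strain, mutation in steps:
        carried.append(mutation)
        genotypes[strain] = tuple(carried)
    return genotypes


for strain, mutations in cumulative_genotypes(STEPS).items():
    print(f"{strain}: {', '.join(mutations)}")
```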

(6) Lysine production test on HD-1, AHD-2, AHP-3 and APZ-4 strains 

[0396] The HD-1, AHD-2, AHP-3 and APZ-4 strains obtained above were subjected to a culture test in a 5 l jar
fermenter in accordance with the method of Example 2(3).
[0397] Table 3 shows the results. 



Table 3

Strain    L-Lysine hydrochloride (g/l)    Productivity (g/l/h)
HD-1      8                               0.3
AHD-2     73                              2.5
AHP-3     80                              2.8
APZ-4     86                              3.0



[0398] Since the lysine-producing mutant B-6 strain which has been bred based on the random mutation and selection 
shows a productivity of less than 2.1 g/l/h, the APZ-4 strain showing a high productivity of 3.0 g/l/h is useful in industry. 
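
As a worked check of the productivity figures, the values in Table 3 are consistent with dividing each titer by the culturing time. The sketch below assumes that the 29-hour culturing time stated in Example 2(3) also applies to the cultures of Table 3, which the text implies but does not state explicitly; the variable names are hypothetical.

```python
# Assumption: the 29 h culture time of Example 2(3) applies to Table 3.
CULTURE_HOURS = 29

# L-lysine hydrochloride titers (g/l) from Table 3.
titers_g_per_l = {"HD-1": 8, "AHD-2": 73, "AHP-3": 80, "APZ-4": 86}

# Productivity (g/l/h) = titer / culture time, rounded to one decimal place.
productivity = {
    strain: round(titer / CULTURE_HOURS, 1)
    for strain, titer in titers_g_per_l.items()
}

for strain, value in productivity.items():
    print(f"{strain}: {value} g/l/h")
```

Under that assumption the computed values are 0.3, 2.5, 2.8 and 3.0 g/l/h, matching the table.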

(7) Lysine fermentation by APZ-4 strain at high temperature 

[0399] The APZ-4 strain, which had been reconstructed by introducing 4 effective mutations into the wild type strain,
was subjected to the culturing test in a 5 l jar fermenter in the same manner as in Example 2(3), except that the culturing
temperature was changed to 40°C.
[0400] The results are shown in Table 4. 



Table 4

Temperature (°C)    L-Lysine hydrochloride (g/l)    Productivity (g/l/h)
32                  86                              3.0
40                  95                              3.3



[0401] As is apparent from the results shown in Table 4, the lysine hydrochloride titer and productivity obtained in
culturing at a high temperature of 40°C were comparable to those at 32°C. In the mutated and bred lysine-producing
B-6 strain constructed by repeating random mutation and selection, the growth and the lysine productivity are lowered
at temperatures exceeding 34°C so that lysine fermentation cannot be carried out, whereas lysine fermentation can
be carried out using the APZ-4 strain at a high temperature of 40°C, so that the load of cooling is greatly reduced and
the strain is industrially useful. The lysine fermentation at high temperatures can be achieved because the high
temperature adaptability inherently possessed by the wild type strain is retained in the APZ-4 strain.

[0402] As demonstrated in the reconstruction of the lysine-producing strain, the present invention provides a novel 
breeding method effective for eliminating the problems in the conventional mutants and acquiring industrially advan- 
tageous strains. This methodology which reconstitutes the production strain by reconstituting the effective mutation is 
an approach which is efficiently carried out using the nucleotide sequence information of the genome disclosed in the 
present invention, and its effectiveness was found for the first time in the present invention. 

Example 4 

Production of DNA microarray and use thereof 

[0403] A DNA microarray was produced based on the nucleotide sequence information of the ORFs deduced, using
software, from the full nucleotide sequence of Corynebacterium glutamicum ATCC 13032, and genes whose
expression fluctuates depending on the carbon source during culturing were searched for.

(1) Production of DNA microarray 

[0404] Chromosomal DNA was prepared from Corynebacterium glutamicum ATCC 13032 by the method of Saito et
al. (Biochim. Biophys. Acta, 72: 619 (1963)). Based on 24 genes having the nucleotide sequences represented by
SEQ ID NOS:207, 3433, 281, 3435, 3439, 765, 3445, 1226, 1229, 3448, 3451, 3453, 3455, 1743, 3470, 2132, 3476,
3477, 3485, 3488, 3489, 3494, 3496, and 3497 from the ORFs shown in Table 1 deduced from the full genome
nucleotide sequence of Corynebacterium glutamicum ATCC 13032 using software and the nucleotide sequence of rabbit
globin gene (GenBank Accession No. V00882) used as an internal standard, oligo DNA primers for PCR amplification
represented by SEQ ID NOS:7010 to 7059 targeting the nucleotide sequences of the genes were synthesized in a
usual manner.

[0405] As the oligo DNA primers used for the PCR,

[0406] DNAs having the nucleotide sequences represented by SEQ ID NOS:7010 and 7011 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:207,

[0407] DNAs having the nucleotide sequences represented by SEQ ID NOS:7012 and 7013 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3433,

[0408] DNAs having the nucleotide sequences represented by SEQ ID NOS:7014 and 7015 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:281,

[0409] DNAs having the nucleotide sequences represented by SEQ ID NOS:7016 and 7017 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3435,

[0410] DNAs having the nucleotide sequences represented by SEQ ID NOS:7018 and 7019 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3439,

[0411] DNAs having the nucleotide sequences represented by SEQ ID NOS:7020 and 7021 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:765,

[0412] DNAs having the nucleotide sequences represented by SEQ ID NOS:7022 and 7023 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3445,

[0413] DNAs having the nucleotide sequences represented by SEQ ID NOS:7024 and 7025 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:1226,

[0414] DNAs having the nucleotide sequences represented by SEQ ID NOS:7026 and 7027 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:1229,

[0415] DNAs having the nucleotide sequences represented by SEQ ID NOS:7028 and 7029 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3448,

[0416] DNAs having the nucleotide sequences represented by SEQ ID NOS:7030 and 7031 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3451,

[0417] DNAs having the nucleotide sequences represented by SEQ ID NOS:7032 and 7033 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3453,

[0418] DNAs having the nucleotide sequences represented by SEQ ID NOS:7034 and 7035 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3455,

[0419] DNAs having the nucleotide sequences represented by SEQ ID NOS:7036 and 7037 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:1743,

[0420] DNAs having the nucleotide sequences represented by SEQ ID NOS:7038 and 7039 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3470,

[0421] DNAs having the nucleotide sequences represented by SEQ ID NOS:7040 and 7041 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:2132,

[0422] DNAs having the nucleotide sequences represented by SEQ ID NOS:7042 and 7043 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3476,

[0423] DNAs having the nucleotide sequences represented by SEQ ID NOS:7044 and 7045 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3477,

[0424] DNAs having the nucleotide sequences represented by SEQ ID NOS:7046 and 7047 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3485,

[0425] DNAs having the nucleotide sequences represented by SEQ ID NOS:7048 and 7049 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3488,

[0426] DNAs having the nucleotide sequences represented by SEQ ID NOS:7050 and 7051 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3489,

[0427] DNAs having the nucleotide sequences represented by SEQ ID NOS:7052 and 7053 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3494,

[0428] DNAs having the nucleotide sequences represented by SEQ ID NOS:7054 and 7055 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3496,

[0429] DNAs having the nucleotide sequences represented by SEQ ID NOS:7056 and 7057 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3497, and

[0430] DNAs having the nucleotide sequences represented by SEQ ID NOS:7058 and 7059 were used for the
amplification of the DNA having the nucleotide sequence of the rabbit globin gene,
as the respective primer set.
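
The primer assignment listed above follows a regular pattern: the targets are taken in the order given in the text and each receives the next two consecutive primer SEQ ID numbers, starting at 7010. A minimal sketch of that mapping follows; the function and constant names are hypothetical, while the numbers are those of the list.

```python
# Target ORFs in the order listed in the text (SEQ ID NOs of the DNA sequences).
TARGET_SEQ_IDS = [
    207, 3433, 281, 3435, 3439, 765, 3445, 1226, 1229, 3448, 3451, 3453,
    3455, 1743, 3470, 2132, 3476, 3477, 3485, 3488, 3489, 3494, 3496, 3497,
]
FIRST_PRIMER_SEQ_ID = 7010


def primer_pairs():
    """Map each amplification target to its pair of primer SEQ ID numbers."""
    targets = [f"SEQ ID NO:{n}" for n in TARGET_SEQ_IDS] + ["rabbit globin gene"]
    return {
        target: (FIRST_PRIMER_SEQ_ID + 2 * i, FIRST_PRIMER_SEQ_ID + 2 * i + 1)
        for i, target in enumerate(targets)
    }


pairs = primer_pairs()
print(pairs["SEQ ID NO:207"])       # (7010, 7011)
print(pairs["SEQ ID NO:1743"])      # (7036, 7037)
print(pairs["rabbit globin gene"])  # (7058, 7059)
```

The 24 ORFs plus the internal standard give 25 targets and 50 primers, which agrees with the range SEQ ID NOS:7010 to 7059.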

[0431] The PCR was carried out for 30 cycles with each cycle consisting of 15 seconds at 95°C and 3 minutes at 68°C
using a thermal cycler (GeneAmp PCR system 9600, manufactured by Perkin Elmer), TaKaRa Ex-Taq (manufactured
by Takara Shuzo), 100 ng of the chromosomal DNA and the buffer attached to the TaKaRa Ex-Taq reagent. In the case
of the rabbit globin gene, a single-stranded cDNA which had been synthesized from rabbit globin mRNA (manufactured
by Life Technologies) according to the manufacturer's instructions using a reverse transcriptase RAV-2 (manufactured
by Takara Shuzo) was used as the template. The PCR product of each gene thus amplified was subjected to agarose
gel electrophoresis and extracted and purified using QIAquick Gel Extraction Kit (manufactured by QIAGEN). The
purified PCR product was concentrated by precipitating it with ethanol and adjusted to a concentration of 200 ng/µl.
Each PCR product was spotted on a slide glass plate (manufactured by Matsunami Glass) having MAS coating in
2 runs using GTMASS SYSTEM (manufactured by Nippon Laser & Electronics Lab.) according to the manufacturer's
instructions.
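
For planning purposes, the total hold time of the two-step cycling program can be computed directly from the stated conditions. The sketch below counts only the programmed hold times; temperature ramping and any initial or final steps are not described in the text and are left out.

```python
from datetime import timedelta

CYCLES = 30
# Hold times of one cycle: 15 seconds at 95°C, then 3 minutes at 68°C.
step_durations = [timedelta(seconds=15), timedelta(minutes=3)]

# Total programmed hold time over all cycles (ramp times excluded).
total_hold_time = CYCLES * sum(step_durations, timedelta())
print(total_hold_time)  # 1:37:30
```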

(2) Synthesis of fluorescence labeled cDNA 

[0432] The ATCC 13032 strain was spread on BY agar medium (medium prepared by adding 20 g of peptone
(manufactured by Kyokuto Pharmaceutical), 5 g of yeast extract (manufactured by Difco), and 16 g of Bactoagar
(manufactured by Difco) to 1 liter of water and adjusting its pH to 7.2) and cultured at 30°C for 2 days. Then, the
cultured strain was further inoculated into 5 ml of BY liquid medium and cultured at 30°C overnight. Then, the cultured
strain was further inoculated into 30 ml of a minimum medium (medium prepared by adding 5 g of ammonium sulfate,
5 g of urea, 0.5 g of monopotassium dihydrogenphosphate, 0.5 g of dipotassium monohydrogenphosphate, 20.9 g of
morpholinopropanesulfonic acid, 0.25 g of magnesium sulfate heptahydrate, 10 mg of calcium chloride dihydrate, 10 mg
of manganese sulfate monohydrate, 10 mg of ferrous sulfate heptahydrate, 1 mg of zinc sulfate heptahydrate, 0.2 mg
of copper sulfate, and 0.2 mg of biotin to 1 liter of water, and adjusting its pH to 6.5) containing 110 mmol/l glucose
or 200 mmol/l ammonium acetate, and cultured in an Erlenmeyer flask at 30°C to an absorbance of 1.0 at 660 nm. After
the cells were collected by centrifuging at 4°C and 5,000 rpm for 10 minutes, total RNA was prepared from the resulting
cells according to the method of Bormann et al. (Molecular Microbiology, 6: 317-326 (1992)). To avoid contamination
with DNA, the RNA was treated with DNase I (manufactured by Takara Shuzo) at 37°C for 30 minutes and then further
purified using Qiagen RNeasy MiniKit (manufactured by QIAGEN) according to the manufacturer's instructions. To 30
µg of the resulting total RNA, 0.6 µl of rabbit globin mRNA (50 ng/µl, manufactured by Life Technologies) and 1 µl of
a random 6 mer primer (500 ng/µl, manufactured by Takara Shuzo) were added, and the mixture was denatured at 65°C
for 10 minutes, followed by quenching on ice. To the resulting solution, 6 µl of a buffer attached to Superscript II
(manufactured by Life Technologies), 3 µl of 0.1 mol/l DTT, 1.5 µl of dNTPs (25 mmol/l dATP, 25 mmol/l dCTP,
25 mmol/l dGTP, 10 mmol/l dTTP), 1.5 µl of Cy5-dUTP or Cy3-dUTP (manufactured by NEN) and 2 µl of Superscript II
were added, and the mixture was allowed to stand at 25°C for 10 minutes and then at 42°C for 110 minutes. The RNA
extracted from the cells using glucose as the carbon source and the RNA extracted from the cells using ammonium
acetate were labeled with Cy5-dUTP and Cy3-dUTP, respectively. After the fluorescence labeling reaction, the RNA was
digested by adding 1.5 µl of 1 mol/l sodium hydroxide-20 mmol/l EDTA solution and 3.0 µl of 10% SDS solution, and
allowing the mixture to stand at 65°C for 10 minutes. The two cDNA solutions after the labeling were mixed and purified
using Qiagen PCR purification Kit (manufactured by QIAGEN) according to the manufacturer's instructions to give a
volume of 10 µl.


(3) Hybridization 

[0433] UltraHyb (110 µl) (manufactured by Ambion) and the fluorescence-labeled cDNA solution (10 µl) were mixed
and subjected to hybridization and the subsequent washing of the slide glass using GeneTAC Hybridization Station
(manufactured by Genomic Solutions) according to the manufacturer's instructions. The hybridization was carried out
at 50°C, and the washing was carried out at 25°C.

(4) Fluorescence analysis 

[0434] The fluorescence amount of each DNA array having the fluorescent cDNA hybridized therewith was measured
using ScanArray 4000 (manufactured by GSI Lumonics).

[0435] Table 5 shows the Cy3 and Cy5 signal intensities of the genes, corrected on the basis of the data for the
rabbit globin used as the internal standard, together with the Cy3/Cy5 ratios.

Table 5

SEQ ID NO    Cy3 intensity    Cy5 intensity    Cy3/Cy5
207          5248             3240             1.62
3433         2239             2694             0.83
281          2370             2595             0.91
3435         2566             2515             1.02
3439         5597             6944             0.81
765          6134             4943             1.24
3455         1169             1284             0.91
1226         1301             1493             0.87
1229         1168             1131             1.03
3448         1187             1594             0.74
3451         2845             3859             0.74
3453         3498             1705             2.05
3455         1491             1144             1.30
1743         1972             1841             1.07
3470         4752             3764             1.26
2132         1173             1085             1.08
3476         1847             1420             1.30
3477         1284             1164             1.10
3485         4539             8014             0.57
3488         34289            1398             24.52
3489         43645            1497             29.16
3494         3199             2503             1.28
3496         3428             2364             1.45
3497         3848             3358             1.15


[0436] The ORF function data estimated by using software were searched for SEQ ID NOS:3488 and 3489, which showed
remarkably strong Cy3 signals. As a result, it was found that SEQ ID NOS:3488 and 3489 are a malate synthase
gene and an isocitrate lyase gene, respectively. It is known that these genes are transcriptionally induced by acetic
acid in Corynebacterium glutamicum (Archives of Microbiology, 168: 262-269 (1997)).
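
As an illustration of how the Cy3/Cy5 ratios in Table 5 can be screened, the following minimal Python sketch recomputes the ratio for a few rows of the table and flags strongly acetate-induced genes. The 10-fold cutoff and the function name are assumptions made for this illustration only; the Example itself does not state a numerical threshold.

```python
# Illustrative sketch: recompute Cy3/Cy5 ratios for selected rows of Table 5.
# The cutoff value and the function name are hypothetical.

TABLE5_ROWS = [
    # (SEQ ID NO, Cy3 intensity, Cy5 intensity), values copied from Table 5
    (207, 5248, 3240),
    (3453, 3498, 1705),
    (3485, 4539, 8014),
    (3488, 34289, 1398),
    (3489, 43645, 1497),
]


def induced_genes(rows, cutoff=10.0):
    """Return {SEQ ID NO: ratio} for rows whose Cy3/Cy5 ratio is >= cutoff."""
    result = {}
    for seq_id, cy3, cy5 in rows:
        ratio = cy3 / cy5
        if ratio >= cutoff:
            result[seq_id] = ratio
    return result


if __name__ == "__main__":
    for seq_id, ratio in induced_genes(TABLE5_ROWS).items():
        print(f"SEQ ID NO:{seq_id}  Cy3/Cy5 = {ratio:.2f}")
```

With this assumed cutoff, only the two genes discussed above (SEQ ID NOS:3488 and 3489) are selected from the listed rows.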

[0437] As described above, a gene whose expression fluctuates could be discovered by synthesizing appropriate
oligo DNA primers based on the ORF nucleotide sequence information deduced, using software, from the full genomic
nucleotide sequence information of Corynebacterium glutamicum ATCC 13032, amplifying the nucleotide sequences
of the genes using the genome DNA of Corynebacterium glutamicum as a template in the PCR, and thus
producing and using a DNA microarray.

[0438] This Example shows that the expression amount can be analyzed using a DNA microarray for the 24 genes.
On the other hand, the present DNA microarray techniques make it possible to prepare DNA microarrays having thereon 
several thousand gene probes at once. Accordingly, it is also possible to prepare DNA microarrays having thereon all 
of the ORF gene probes deduced from the full genomic nucleotide sequence of Corynebacterium glutamicum ATCC 
13032 determined by the present invention, and analyze the expression profile at the total gene level of Corynebac- 
terium glutamicum using these arrays. 

Example 5 



Homology search using Corynebacterium glutamicum genome sequence 
(1) Search of adenosine deaminase 

[0439] The amino acid sequence (ADD_ECOLI) of Escherichia coli adenosine deaminase was obtained from the Swiss-Prot
Database as the amino acid sequence of the protein of which function had been confirmed as adenosine deaminase
(EC3.5.4.4). By using the full length of this amino acid sequence as a query, a homology search was carried out on a
nucleotide sequence database of the genome sequence of Corynebacterium glutamicum or a database of the amino
acid sequences in the ORF region deduced from the genome sequence using FASTA program (Proc. Natl. Acad. Sci. USA,
85: 2444-2448 (1988)). A case where the E-value was 1e-10 or less was judged as being significantly homologous. As
a result, no sequence significantly homologous with the Escherichia coli adenosine deaminase was found in the
nucleotide sequence database of the genome sequence of Corynebacterium glutamicum or the database of the amino
acid sequences in the ORF region deduced from the genome sequence. Based on these results, it is assumed that
Corynebacterium glutamicum contains no ORF having adenosine deaminase activity and thus has no activity of
converting adenosine into inosine.
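
The significance criterion used in this search (an E-value of 1e-10 or less) can be sketched as a simple filter over search hits. The following Python fragment is only an illustration under stated assumptions: the `Hit` record, the subject names and the E-values are hypothetical and are not output of the FASTA program.

```python
# Illustrative sketch: keep only homology-search hits at or below an
# E-value threshold of 1e-10. All hit records below are hypothetical.
from typing import NamedTuple


class Hit(NamedTuple):
    query: str      # query sequence name, e.g. a Swiss-Prot entry name
    subject: str    # database sequence name (hypothetical here)
    e_value: float  # expectation value reported by the search program


E_VALUE_THRESHOLD = 1e-10


def significant_hits(hits, threshold=E_VALUE_THRESHOLD):
    """Return the hits judged significantly homologous (E-value <= threshold)."""
    return [h for h in hits if h.e_value <= threshold]


# Hypothetical weak hits for the adenosine deaminase query.
example_hits = [
    Hit("ADD_ECOLI", "hypothetical_orf_a", 0.37),
    Hit("ADD_ECOLI", "hypothetical_orf_b", 2.1e-3),
]

if __name__ == "__main__":
    found = significant_hits(example_hits)
    print("significant homolog found" if found else "no significant homolog")
```

An empty result corresponds to the conclusion drawn above, namely that no significantly homologous sequence was found for the query.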

(2) Search of glycine cleavage enzyme 

[0440] The sequences (GCSP_ECOLI, GCST_ECOLI and GCSH_ECOLI) of glycine decarboxylase, aminomethyl
transferase and an aminomethyl group carrier, each of which is a component of Escherichia coli glycine cleavage
enzyme, as the amino acid sequences of the proteins of which function had been confirmed as glycine cleavage enzyme
(EC2.1.2.10), were obtained from the Swiss-Prot Database.

[0441] By using these full-length amino acid sequences as a query, a homology search was carried out on a nucleotide
sequence database of the genome sequence of Corynebacterium glutamicum or a database of the ORF amino acid
sequences deduced from the genome sequence using FASTA program. A case where the E-value was 1e-10 or less was
judged as being significantly homologous. As a result, no sequence significantly homologous with the glycine
decarboxylase, the aminomethyl transferase or the aminomethyl group carrier, each of which is a component of
Escherichia coli glycine cleavage enzyme, was found in the nucleotide sequence database of the genome sequence of
Corynebacterium glutamicum or the database of the ORF amino acid sequences estimated from the genome sequence.
Based on these results, it is assumed that Corynebacterium glutamicum contains no ORF having the activity of glycine
decarboxylase, aminomethyl transferase or the aminomethyl group carrier and thus has no activity of the glycine
cleavage enzyme.

(3) Search of IMP dehydrogenase 


[0442] The amino acid sequence (IMDH_ECOLI) of Escherichia coli IMP dehydrogenase, as the amino acid sequence
of the protein of which function had been confirmed as IMP dehydrogenase (EC1.1.1.205), was obtained from the
Swiss-Prot Database. By using the full length of this amino acid sequence as a query, a homology search was carried
out on a nucleotide sequence database of the genome sequence of Corynebacterium glutamicum or a database of the ORF
amino acid sequences predicted from the genome sequence using FASTA program. A case where the E-value was 1e-10
or less was judged as being significantly homologous. As a result, the amino acid sequences encoded by two ORFs,
namely, an ORF positioned in the region of the nucleotide sequence No. 615336 to 616853 (or ORF having the
nucleotide sequence represented by SEQ ID NO:672) and another ORF positioned in the region of the nucleotide sequence
No. 616973 to 618094 (or ORF having the nucleotide sequence represented by SEQ ID NO:674), were significantly
homologous with the ORF of Escherichia coli IMP dehydrogenase. By using the above-described predicted amino
acid sequences as a query in order to examine the similarity of the amino acid sequences encoded by the ORFs with
IMP dehydrogenases of other organisms in greater detail, a search was carried out on GenBank
(http://www.ncbi.nlm.nih.gov/) nr-aa database (amino acid sequence database constructed on the basis of GenBank CDS
translation products, PDB database, Swiss-Prot database, PIR database and PRF database by eliminating duplicated
registrations) using BLAST program. As a result, both of the two amino acid sequences showed significant homologies
with IMP dehydrogenases of other organisms and clearly higher homologies with IMP dehydrogenases than with amino
acid sequences of other proteins, and thus, it was assumed that the two ORFs would function as IMP dehydrogenase.
Based on these results, it was therefore assumed that Corynebacterium glutamicum has two ORFs having the IMP
dehydrogenase activity.
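
The genome coordinates quoted for the two ORFs can be checked with simple arithmetic. The sketch below assumes that the stated positions are 1-based and inclusive (an assumption; the text does not say so explicitly) and computes the length of each region, the number of codons it spans, and the spacing between the two regions. The function names are hypothetical.

```python
# Illustrative sketch: basic arithmetic on the ORF coordinates quoted above.
# Assumption: positions are 1-based and inclusive.

ORFS = {
    "SEQ ID NO:672": (615336, 616853),
    "SEQ ID NO:674": (616973, 618094),
}


def orf_length(start, end):
    """Length in nucleotides of an inclusive coordinate range."""
    return end - start + 1


def summarize(orfs):
    """Return length, codon count and frame check for each ORF."""
    summary = {}
    for name, (start, end) in orfs.items():
        length = orf_length(start, end)
        summary[name] = {
            "length_nt": length,
            "codons": length // 3,
            "in_frame": length % 3 == 0,
        }
    return summary


def gap_between(first, second):
    """Number of nucleotides lying strictly between two regions."""
    return second[0] - first[1] - 1


if __name__ == "__main__":
    for name, info in summarize(ORFS).items():
        print(name, info)
    print("gap (nt):", gap_between(ORFS["SEQ ID NO:672"], ORFS["SEQ ID NO:674"]))
```

Under the stated assumption, both regions have lengths that are multiples of three, which is consistent with their being open reading frames.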


Example 6 

Proteome analysis of proteins derived from Corynebacterium glutamicum

(1) Preparation of proteins derived from Corynebacterium glutamicum ATCC 13032, FERM BP-7134 and FERM BP-158

[0443] Culturing tests of Corynebacterium glutamicum ATCC 13032 (wild type strain), Corynebacterium glutamicum
FERM BP-7134 (lysine-producing strain) and Corynebacterium glutamicum FERM BP-158 (lysine-highly producing
strain) were carried out in a 5 l jar fermenter according to the method in Example 2(3). The results are shown in Table 6.

Table 6

Strain          L-Lysine yield (g/l)
ATCC 13032      0
FERM BP-7134    45
FERM BP-158     60


[0444] After culturing, cells of each strain were recovered by centrifugation. These cells were washed three times
with Tris-HCl buffer (10 mmol/l Tris-HCl, pH 6.5, 1.6 mg/ml protease inhibitor (COMPLETE; manufactured by Boehringer
Mannheim)) to give washed cells which could be stored frozen at -80°C. The freeze-stored cells were thawed
before use, and used as washed cells.

[0445] The washed cells described above were suspended in a disruption buffer (10 mmol/l Tris-HCl, pH 7.4, 5 mmol/l
magnesium chloride, 50 mg/l RNase, 1.6 mg/ml protease inhibitor (COMPLETE; manufactured by Boehringer
Mannheim)), and disrupted with a disruptor (manufactured by Brown) under cooling. To the resulting disruption solution,
DNase was added to give a concentration of 50 mg/l, and the mixture was allowed to stand on ice for 10 minutes. The
solution was centrifuged (5,000 x g, 15 minutes, 4°C) to remove the undisrupted cells as the precipitate, and the
supernatant was recovered.

[0446] To the supernatant, urea was added to give a concentration of 9 mol/l, and an equivalent amount of a lysis
buffer (9.5 mol/l urea, 2% NP-40, 2% Ampholine, 5% mercaptoethanol, 1.6 mg/ml protease inhibitor (COMPLETE;
manufactured by Boehringer Mannheim)) was added thereto, followed by thorough stirring at room temperature for
dissolving.

[0447] After being dissolved, the solution was centrifuged at 12,000 x g for 15 minutes, and the supernatant was 
recovered. 

[0448] To the supernatant, ammonium sulfate was added to 80% saturation, followed by thorough
stirring for dissolving.

[0449] After being dissolved, the solution was centrifuged (16,000 x g, 20 minutes, 4°C), and the precipitate was 
recovered. This precipitate was dissolved in the lysis buffer again and used in the subsequent procedures as a protein 
sample. The protein concentration of this sample was determined by the method for quantifying protein of Bradford. 


(2) Separation of protein by two dimensional electrophoresis 

[0450] The first dimensional electrophoresis was carried out as described below by the isoelectric electrophoresis 
method. 

[0451] A molded dry IPG strip gel (pH 4-7, 13 cm, Immobiline DryStrips; manufactured by Amersham Pharmacia
Biotech) was set in an electrophoretic apparatus (Multiphor II or IPGphor; manufactured by Amersham Pharmacia
Biotech), a swelling solution (8 mol/l urea, 0.5% Triton X-100, 0.6% dithiothreitol, 0.5% Ampholine, pH 3-10) was
added thereto, and the gel was allowed to swell for 12 to 16 hours.

[0452] The protein sample prepared above was dissolved in a sample solution (9 mol/l urea, 2% CHAPS, 1%
dithiothreitol, 2% Ampholine, pH 3-10), and then about 100 to 500 µg (in terms of protein) portions thereof were taken
and added to the swollen IPG strip gel.

[0453] The electrophoresis was carried out in the 4 steps defined below while controlling the temperature at 20°C:

step 1: 1 hour under a gradient mode of 0 to 500 V;
step 2: 1 hour under a gradient mode of 500 to 1,000 V;
step 3: 4 hours under a gradient mode of 1,000 to 8,000 V; and
step 4: 1 hour at a constant voltage of 8,000 V.
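
The four-step voltage program above can be summarized as a total number of volt-hours. The following Python sketch assumes that "gradient mode" means a linear voltage ramp between the stated start and end voltages, so that the mean voltage of a step is the average of its endpoints; this interpretation, and the function name, are assumptions made for illustration.

```python
# Illustrative sketch: total volt-hours of the 4-step focusing program.
# Assumption: each gradient step is a linear ramp between its endpoints.

STEPS = [
    # (start voltage in V, end voltage in V, duration in hours)
    (0, 500, 1.0),
    (500, 1000, 1.0),
    (1000, 8000, 4.0),
    (8000, 8000, 1.0),
]


def volt_hours(steps):
    """Sum of (mean voltage x duration) over all steps."""
    return sum((v_start + v_end) / 2 * hours for v_start, v_end, hours in steps)


if __name__ == "__main__":
    total_hours = sum(hours for _, _, hours in STEPS)
    print(f"total run time: {total_hours} h")
    print(f"total: {volt_hours(STEPS)} Vh")
```

Under the linear-ramp assumption, the steps contribute 250, 750, 18,000 and 8,000 Vh respectively.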

[0454] After the isoelectric electrophoresis, the IPG strip gel was removed from the holder and soaked in an
equilibration buffer A (50 mmol/l Tris-HCl, pH 6.8, 30% glycerol, 1% SDS, 0.25% dithiothreitol) for 15 minutes and then
in another equilibration buffer B (50 mmol/l Tris-HCl, pH 6.8, 6 mol/l urea, 30% glycerol, 1% SDS, 0.45% iodoacetamide)
for 15 minutes to sufficiently equilibrate the gel.

[0455] After the equilibration, the IPG strip gel was lightly rinsed in an SDS electrophoresis buffer (1.4% glycine, 0.1%
SDS, 0.3% Tris-HCl, pH 8.5), and the second dimensional electrophoresis depending on molecular weight was carried
out as described below to separate the proteins.

[0456] Specifically, the above IPG strip gel was closely placed on a 14% polyacrylamide slab gel (14% polyacrylamide,
0.37% bisacrylamide, 37.5 mmol/l Tris-HCl, pH 8.8, 0.1% SDS, 0.1% TEMED, 0.1% ammonium persulfate) and
subjected to electrophoresis under a constant current of 30 mA at 20°C for 3 hours to separate the proteins.

(3) Detection of protein spot 

[0457] Coomassie staining was performed by the method of Görg et al. (Electrophoresis, 9: 531-546 (1988)) for the
slab gel after the second dimensional electrophoresis. Specifically, the slab gel was stained under shaking at 25°C for
about 3 hours, the excessive coloration was removed with a decoloring solution, and the gel was thoroughly washed
with distilled water.

[0458] The results are shown in Fig. 2. The proteins derived from the ATCC 13032 strain (Fig. 2A), FERM BP-7134
strain (Fig. 2B) and FERM BP-158 strain (Fig. 2C) could be separated and detected as spots.

(4) In-gel digestion of detected protein spot 

[0459] The detected spots were each cut out from the gel and transferred into a siliconized tube, and 400 µl of 100
mmol/l ammonium bicarbonate : acetonitrile solution (1:1, v/v) was added thereto, followed by shaking overnight and
freeze-drying as such. To the dried gel, 10 µl of a lysylendopeptidase (LysC) solution (manufactured by WAKO, prepared
with 0.1% SDS-containing 50 mmol/l ammonium bicarbonate to give a concentration of 100 ng/µl) was added and the
gel was allowed to stand for swelling at 0°C for 45 minutes, and then allowed to stand at 37°C for 16 hours. After
removing the LysC solution, 20 µl of an extracting solution (a mixture of 60% acetonitrile and 5% formic acid) was
added, followed by ultrasonication at room temperature for 5 minutes to disrupt the gel. After the disruption, the extract
was recovered by centrifugation (12,000 rpm, 5 minutes, room temperature). This operation was repeated twice to
recover the whole extract. The recovered extract was concentrated by centrifugation in vacuo to halve the liquid volume.
To the concentrate, 20 µl of 0.1% trifluoroacetic acid was added, followed by thorough stirring, and the mixture was
subjected to desalting using ZipTip (manufactured by Millipore). The protein adsorbed on the carriers of ZipTip was
eluted with 5 µl of α-cyano-4-hydroxycinnamic acid for use as a sample solution for analysis.

(5) Mass spectrometry and amino acid sequence analysis of protein spot with matrix assisted laser desorption ionization 
time of flight mass spectrometer (MALDI-TOFMS) 

[0460] The sample solution for analysis was mixed in the equivalent amount with a solution of a peptide mixture for
mass calibration (300 nmol/l Angiotensin II, 300 nmol/l Neurotensin, 150 nmol/l ACTH clip 18-39, 2.3 µmol/l bovine
insulin B chain), and 1 µl of the obtained solution was spotted on a stainless probe and crystallized by spontaneous
drying.

[0461] As measurement instruments, REFLEX MALDI-TOF mass spectrometer (manufactured by Bruker) and an
N2 laser (337 nm) were used in combination.

[0462] The analysis by PMF (peptide-mass fingerprinting) was carried out using integration spectra data obtained
by measuring 30 times at an accelerating voltage of 19.0 kV and a detector voltage of 1.50 kV under reflector mode
conditions. Mass calibration was carried out by the internal standard method.

[0463] The PSD (post-source decay) analysis was carried out using integration spectra obtained by successively
altering the reflection voltage and the detector voltage at an accelerating voltage of 27.5 kV.

[0464] The masses and amino acid sequences of the peptide fragments derived from the protein spot after digestion 
were thus determined. 

(6) Identification of protein spot 


[0465] From the amino acid sequence information of the digested peptide fragments derived from the protein spot
obtained in the above (5), ORFs corresponding to the protein were searched on the genome sequence database of
Corynebacterium glutamicum ATCC 13032 as constructed in Example 1 to identify the protein.

[0466] The identification of the protein was carried out using MS-Fit program and MS-Tag program of intranet Protein
Prospector.
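
The principle of matching measured peptide masses against a sequence database can be sketched as follows. This is a simplified illustration, not the MS-Fit or MS-Tag algorithm: it assumes cleavage on the C-terminal side of every lysine (as expected for lysylendopeptidase), ignores missed cleavages and chemical modifications, and uses a hypothetical mass tolerance of 0.2. The function names are likewise hypothetical. Residue masses are standard monoisotopic values.

```python
# Illustrative sketch of peptide-mass fingerprinting with a Lys-C digest.
# Not the MS-Fit algorithm; tolerance and function names are hypothetical.

MONO = {  # monoisotopic residue masses
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def lysc_digest(protein):
    """Split a protein sequence after every lysine (K)."""
    peptides, current = [], ""
    for residue in protein:
        current += residue
        if residue == "K":
            peptides.append(current)
            current = ""
    if current:
        peptides.append(current)
    return peptides


def mh_plus(peptide):
    """Monoisotopic mass of the singly protonated peptide, [M+H]+."""
    return sum(MONO[residue] for residue in peptide) + WATER + PROTON


def match_masses(observed, protein, tol=0.2):
    """Return digest peptides whose [M+H]+ lies within tol of an observed mass."""
    theoretical = {p: mh_plus(p) for p in lysc_digest(protein)}
    return [p for p, m in theoretical.items()
            if any(abs(m - o) <= tol for o in observed)]


if __name__ == "__main__":
    # Angiotensin II (DRVYIHPF), one of the calibration peptides named above.
    print(f"{mh_plus('DRVYIHPF'):.4f}")
```

In a database search, the number of matched peptides per ORF would be used to rank candidate ORFs; that scoring step is omitted from this sketch.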

(a) Search and identification of gene encoding high-expression protein 

[0467] Among the proteins derived from Corynebacterium glutamicum ATCC 13032 showing high expression amounts in
the CBB-staining shown in Fig. 2A, the proteins corresponding to Spots-1, 2, 3, 4 and 5 were identified by the above
method.

[0468] As a result, it was found that Spot-1 corresponded to enolase which was a protein having the amino acid
sequence of SEQ ID NO:4585; Spot-2 corresponded to phosphoglycerate kinase which was a protein having the amino
acid sequence of SEQ ID NO:5254; Spot-3 corresponded to glyceraldehyde-3-phosphate dehydrogenase which was
a protein having the amino acid sequence represented by SEQ ID NO:5255; Spot-4 corresponded to fructose
bisphosphate aldolase which was a protein having the amino acid sequence represented by SEQ ID NO:6543; and
Spot-5 corresponded to triose phosphate isomerase which was a protein having the amino acid sequence represented by
SEQ ID NO:5252.

[0469] These genes, represented by SEQ ID NOS:1085, 1754, 1775, 3043 and 1752, which encode the known proteins
corresponding to Spots-1, 2, 3, 4 and 5, respectively, are important in the central metabolic
pathway for maintaining the life of the microorganism. Particularly, it is suggested that the genes of Spots-2, 3 and 5
form an operon and that a high-expression promoter is encoded upstream thereof (J. of Bacteriol., 174: 6067-6086
(1992)).

[0470] Also, the protein corresponding to Spot-9 in Fig. 2 was identified in the same manner as described above,
and it was found that Spot-9 was an elongation factor Tu which was a protein having the amino acid sequence
represented by SEQ ID NO:6937, and that the protein was encoded by DNA having the nucleotide sequence represented
by SEQ ID NO:3437.

[0471] Based on these results, the proteins having a high expression level were identified by proteome analysis using
the genome sequence database of Corynebacterium glutamicum constructed in Example 1. Thus, the nucleotide
sequences of the genes encoding the proteins and the nucleotide sequences upstream thereof could be searched
simultaneously. Accordingly, it is shown that nucleotide sequences having a function as a high-expression promoter can
be efficiently selected.

(b) Search and identification of modified protein

[0472] Among the proteins derived from Corynebacterium glutamicum FERM BP-7134 shown in Fig. 2B, Spots-6,
7 and 8 were identified by the above method. As a result, these three spots all corresponded to catalase which was a
protein having the amino acid sequence represented by SEQ ID NO:3785.

[0473] Accordingly, Spots-6, 7 and 8, detected as spots differing in isoelectric mobility, were all products derived
from a catalase gene having the nucleotide sequence represented by SEQ ID NO:285. It is thus shown that
the catalase derived from Corynebacterium glutamicum FERM BP-7134 was modified after translation.

[0474] Based on these results, it is confirmed that various modified proteins can be efficiently searched by proteome
analysis using the genome sequence database of Corynebacterium glutamicum constructed in Example 1.


(c) Search and identification of expressed protein effective in lysine production 

[0475] It was found that, across Fig. 2A (ATCC 13032: wild type strain), Fig. 2B (FERM BP-7134: lysine-producing
strain) and Fig. 2C (FERM BP-158: lysine-highly producing strain), the catalase corresponding to Spot-8 and the
elongation factor Tu corresponding to Spot-9 as identified above showed higher expression levels with an increase in
the lysine productivity.

[0476] Based on these results, it was found that promising mutated proteins can be efficiently searched and identified,
in breeding aimed at strengthening the productivity of a target product, by the proteome analysis using the genome
sequence database of Corynebacterium glutamicum constructed in Example 1.

[0477] Moreover, useful mutation points of useful mutants can be easily specified by searching the nucleotide
sequences (nucleotide sequences of promoter, ORF, or the like) relating to the identified proteins using the above
database and using primers designed on the basis of the sequences. Once the mutation points are
specified, industrially useful mutants which have the useful mutations or other useful mutations derived therefrom can
be easily bred.

[0478] While the invention has been described in detail and with reference to specific embodiments thereof, it will
be apparent to one of skill in the art that various changes and modifications can be made therein without departing
from the spirit and scope thereof. All references cited herein are incorporated in their entirety.



Claims

1. A method for at least one of the following: 

(A) identifying a mutation point of a gene derived from a mutant of a coryneform bacterium,

(B) measuring an expression amount of a gene derived from a coryneform bacterium,

(C) analyzing an expression profile of a gene derived from a coryneform bacterium,

(D) analyzing expression patterns of genes derived from a coryneform bacterium, or

(E) identifying a gene homologous to a gene derived from a coryneform bacterium,

said method comprising:

(a) producing a polynucleotide array by adhering to a solid support at least two polynucleotides selected
from the group consisting of first polynucleotides comprising the nucleotide sequence represented by any
one of SEQ ID NOS:1 to 3501, second polynucleotides which hybridize with the first polynucleotides under
stringent conditions, and third polynucleotides comprising a sequence of 10 to 200 continuous bases of
the first or second polynucleotides,

(b) incubating the polynucleotide array with at least one of a labeled polynucleotide derived from a
coryneform bacterium, a labeled polynucleotide derived from a mutant of the coryneform bacterium or a
labeled polynucleotide to be examined, under hybridization conditions,

(c) detecting any hybridization, and

(d) analyzing the result of the hybridization.

2. The method according to claim 1, wherein the coryneform bacterium is a microorganism belonging to the genus
Corynebacterium, the genus Brevibacterium, or the genus Microbacterium.

3. The method according to claim 2, wherein the microorganism belonging to the genus Corynebacterium is selected
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

4. The method according to claim 1, wherein the polynucleotide derived from a coryneform bacterium, the
polynucleotide derived from a mutant of the coryneform bacterium or the polynucleotide to be examined is a gene relating
to the biosynthesis of at least one compound selected from an amino acid, a nucleic acid, a vitamin, a saccharide,
an organic acid, and analogues thereof.

5. The method according to claim 1, wherein the polynucleotide to be examined is derived from Escherichia coli.

6. A polynucleotide array, comprising: 

at least two polynucleotides selected from the group consisting of first polynucleotides comprising the
nucleotide sequence represented by any one of SEQ ID NOS:1 to 3501, second polynucleotides which hybridize
with the first polynucleotides under stringent conditions, and third polynucleotides comprising 10 to 200
continuous bases of the first or second polynucleotides, and
a solid support adhered thereto. 

7. A polynucleotide comprising the nucleotide sequence represented by SEQ ID NO:1 or a polynucleotide having a 
homology of at least 80% with the polynucleotide. 

8. A polynucleotide comprising any one of the nucleotide sequences represented by SEQ ID NOS:2 to 3431, or a 
polynucleotide which hybridizes with the polynucleotide under stringent conditions. 

9. A polynucleotide encoding a polypeptide having any one of the amino acid sequences represented by SEQ ID
NOS:3502 to 6931, or a polynucleotide which hybridizes therewith under stringent conditions.

10. A polynucleotide which is present in the 5' upstream or 3' downstream of a polynucleotide comprising the nucleotide
sequence of any one of SEQ ID NOS:2 to 3431 in a whole polynucleotide comprising the nucleotide sequence
represented by SEQ ID NO:1, and has an activity of regulating an expression of the polynucleotide.

11. A polynucleotide comprising 10 to 200 continuous bases in the nucleotide sequence of the polynucleotide of any
one of claims 7 to 10, or a polynucleotide comprising a nucleotide sequence complementary to the polynucleotide
comprising 10 to 200 continuous bases.

12. A recombinant DNA comprising the polynucleotide of any one of claims 8 to 11. 

13. A transformant comprising the polynucleotide of any one of claims 8 to 11 or the recombinant DNA of claim 12. 



14. A method for producing a polypeptide, comprising:
culturing the transformant of claim 13 in a medium to produce and accumulate a polypeptide encoded by the
polynucleotide of claim 8 or 9 in the medium, and
recovering the polypeptide from the medium.

15. A method for producing at least one of an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and 
analogues thereof, comprising: 

culturing the transformant of claim 13 in a medium to produce and accumulate at least one of an amino acid, 
a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof in the medium, and 
recovering the at least one of the amino acid, the nucleic acid, the vitamin, the saccharide, the organic acid, 
and analogues thereof from the medium. 

16. A polypeptide encoded by a polynucleotide comprising the nucleotide sequence selected from SEQ ID NOS:2 to 
3431. 

17. A polypeptide comprising the amino acid sequence selected from SEQ ID NOS:3502 to 6931. 

18. The polypeptide according to claim 16 or 17, wherein at least one amino acid is deleted, replaced, inserted or 
added, said polypeptides having an activity which is substantially the same as that of the polypeptide without said 
at least one amino acid deletion, replacement, insertion or addition. 

19. A polypeptide comprising an amino acid sequence having a homology of at least 60% with the amino acid sequence
of the polypeptide of claim 16 or 17, and having an activity which is substantially the same as that of the polypeptide.

20. An antibody which recognizes the polypeptide of any one of claims 16 to 19. 

21. A polypeptide array, comprising:

at least one polypeptide or partial fragment polypeptide selected from the polypeptides of claims 16 to 19 and
partial fragment polypeptides of the polypeptides, and
a solid support adhered thereto. 

22. A polypeptide array, comprising: 

at least one antibody which recognizes a polypeptide or partial fragment polypeptide selected from the polypep- 
tides of claims 16 to 19 and partial fragment polypeptides of the polypeptides, and 
a solid support adhered thereto. 

23. A system based on a computer for identifying a target sequence or a target structure motif derived from a coryne- 
form bacterium, comprising the following: 

(i) a user input device that inputs at least one nucleotide sequence information selected from SEQ ID NOS:1
to 3501, and target sequence or target structure motif information;

(ii) a data storage device for at least temporarily storing the input information; 

(iii) a comparator that compares the at least one nucleotide sequence information selected from SEQ ID NOS: 
1 to 3501 with the target sequence or target structure motif information, recorded by the data storage device 
for screening and analyzing nucleotide sequence information which is coincident with or analogous to the 
target sequence or target structure motif information; and 

(iv) an output device that shows a screening or analyzing result obtained by the comparator. 

24. A method based on a computer for identifying a target sequence or a target structure motif derived from a coryne- 
form bacterium, comprising the following: 

(i) inputting at least one nucleotide sequence information selected from SEQ ID NOS:1 to 3501, target se- 
quence information or target structure motif information into a user input device; 

(ii) at least temporarily storing said information; 

(iii) comparing the at least one nucleotide sequence information selected from SEQ ID NOS:1 to 3501 with 
the target sequence or target structure motif information; and 
(iv) screening and analyzing nucleotide sequence information which is coincident with or analogous to the
target sequence or target structure motif information. 
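
As an illustration only, and not as part of the claims, the comparing and screening steps of claims 23 and 24 can be sketched as a sliding-window comparison of stored sequences against a target sequence. The function name `screen_sequences`, the record identifiers, and the use of a mismatch threshold to stand in for "analogous" are assumptions made for this example.

```python
from typing import Dict, List, Tuple


def screen_sequences(
    records: Dict[str, str],
    target: str,
    max_mismatches: int = 0,
) -> List[Tuple[str, int, int]]:
    """Return (record_id, offset, mismatches) for every window of each stored
    sequence that matches the target with at most max_mismatches differences.

    max_mismatches=0 models "coincident"; a small positive value is one
    hypothetical way to model "analogous".
    """
    target = target.upper()
    hits = []
    for record_id, sequence in records.items():
        sequence = sequence.upper()
        for offset in range(len(sequence) - len(target) + 1):
            window = sequence[offset:offset + len(target)]
            mismatches = sum(a != b for a, b in zip(window, target))
            if mismatches <= max_mismatches:
                hits.append((record_id, offset, mismatches))
    return hits


# Hypothetical stored sequences; real input would be SEQ ID NOS:1 to 3501.
stored = {"seq_a": "ATGGCTAAGCTT", "seq_b": "ATGGCAAAGCTT"}
exact_hits = screen_sequences(stored, "GCTAAG")
near_hits = screen_sequences(stored, "GCTAAG", max_mismatches=1)
```

A practical system would more likely use an alignment tool that handles insertions and deletions; the fixed-length window is chosen here only to keep the sketch short.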

25. A system based on a computer for identifying a target sequence or a target structure motif derived from a
coryneform bacterium, comprising the following:

(i) a user input device that inputs at least one amino acid sequence information selected from SEQ ID NOS: 
3502 to 7001, and target sequence or target structure motif information;

(ii) a data storage device for at least temporarily storing the input information; 

(iii) a comparator that compares the at least one amino acid sequence information selected from SEQ ID NOS: 
3502 to 7001 with the target sequence or target structure motif information, recorded by the data storage 
device for screening and analyzing amino acid sequence information which is coincident with or analogous to 
the target sequence or target structure motif information; and 

(iv) an output device that shows a screening or analyzing result obtained by the comparator. 

26. A method based on a computer for identifying a target sequence or a target structure motif derived from a coryne- 
form bacterium, comprising the following: 

(i) inputting at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001, and target
sequence information or target structure motif information into a user input device; 

(ii) at least temporarily storing said information; 

(iii) comparing the at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001 
with the target sequence or target structure motif information; and 

(iv) screening and analyzing amino acid sequence information which is coincident with or analogous to the 
target sequence or target structure motif information. 

27. A system based on a computer for determining a function of a polypeptide encoded by a polynucleotide having a 
target nucleotide sequence derived from a coryneform bacterium, comprising the following: 

(i) a user input device that inputs at least one nucleotide sequence information selected from SEQ ID NOS:2 
to 3501, function information of a polypeptide encoded by the nucleotide sequence, and target nucleotide
sequence information; 

(ii) a data storage device for at least temporarily storing the input information; 

(iii) a comparator that compares the at least one nucleotide sequence information selected from SEQ ID NOS: 
2 to 3501 with the target nucleotide sequence information for determining a function of a polypeptide encoded 
by a polynucleotide having the target nucleotide sequence which is coincident with or analogous to the poly- 
nucleotide having at least one nucleotide sequence selected from SEQ ID NOS:2 to 3501; and 

(iv) an output device that shows a function obtained by the comparator.

28. A method based on a computer for determining a function of a polypeptide encoded by a polynucleotide having
a target nucleotide sequence derived from a coryneform bacterium, comprising the following:

(i) inputting at least one nucleotide sequence information selected from SEQ ID NOS:2 to 3501, function in- 
formation of a polypeptide encoded by the nucleotide sequence, and target nucleotide sequence information; 

(ii) at least temporarily storing said information; 

(iii) comparing the at least one nucleotide sequence information selected from SEQ ID NOS:2 to 3501 with 
the target nucleotide sequence information; and 

(iv) determining a function of a polypeptide encoded by a polynucleotide having the target nucleotide sequence 
which is coincident with or analogous to the polynucleotide having at least one nucleotide sequence selected 
from SEQ ID NOS:2 to 3501.
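
As an illustration only, and not as part of the claims, the function-determining step of claims 27 and 28 can be sketched as a lookup of the most similar annotated sequence. The name `infer_function`, the similarity threshold, and the use of Python's `difflib` ratio as a stand-in for a real alignment score are assumptions made for this example.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional, Tuple


def infer_function(
    annotated: Dict[str, Tuple[str, str]],
    target: str,
    min_similarity: float = 0.8,
) -> Optional[Tuple[str, str, float]]:
    """Return (record_id, function, similarity) for the annotated sequence most
    similar to the target, or None if no record reaches min_similarity.

    Similarity is difflib's ratio, used only as a simple stand-in for a real
    alignment score.
    """
    best = None
    for record_id, (sequence, function) in annotated.items():
        similarity = SequenceMatcher(None, sequence, target, autojunk=False).ratio()
        if similarity >= min_similarity and (best is None or similarity > best[2]):
            best = (record_id, function, similarity)
    return best


# Hypothetical annotated records standing in for SEQ ID NOS:2 to 3501.
annotated = {
    "orf_1": ("ATGGCTAAGCTTGGA", "homoserine dehydrogenase"),
    "orf_2": ("ATGCCCTTTAAACGT", "pyruvate carboxylase"),
}
result = infer_function(annotated, "ATGGCTAAGCTTGGA")
no_match = infer_function(annotated, "TTTTTTTTTTTTTTT", min_similarity=0.9)
```

Returning None when nothing reaches the threshold keeps a weak match from being reported as a determined function.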

29. A system based on a computer for determining a function of a polypeptide having a target amino acid sequence 
derived from a coryneform bacterium, comprising the following: 

(i) a user input device that inputs at least one amino acid sequence information selected from SEQ ID NOS: 
3502 to 7001, function information based on the amino acid sequence, and target amino acid sequence
information;
(ii) a data storage device for at least temporarily storing the input information;

(iii) a comparator that compares the at least one amino acid sequence information selected from SEQ ID NOS: 
3502 to 7001 with the target amino acid sequence information for determining a function of a polypeptide
having the target amino acid sequence which is coincident with or analogous to the polypeptide having at least 
one amino acid sequence selected from SEQ ID NOS:3502 to 7001; and 

(iv) an output device that shows a function obtained by the comparator. 

30. A method based on a computer for determining a function of a polypeptide having a target amino acid sequence 
derived from a coryneform bacterium, comprising the following:

(i) inputting at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001, function
information based on the amino acid sequence, and target amino acid sequence information; 

(ii) at least temporarily storing said information; 

(iii) comparing the at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001 
with the target amino acid sequence information; and 

(iv) determining a function of a polypeptide having the target amino acid sequence which is coincident with or 
analogous to the polypeptide having at least one amino acid sequence selected from SEQ ID NOS:3502 to 
7001. 

31. The system according to any one of claims 23, 25, 27 and 29, wherein a coryneform bacterium is a microorganism
of the genus Corynebacterium, the genus Brevibacterium, or the genus Microbacterium. 

32. The method according to any one of claims 24, 26, 28 and 30, wherein a coryneform bacterium is a microorganism 
of the genus Corynebacterium, the genus Brevibacterium, or the genus Microbacterium. 

33. The system according to claim 31, wherein the microorganism belonging to the genus Corynebacterium is selected
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

34. The method according to claim 32, wherein the microorganism belonging to the genus Corynebacterium is selected
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

35. A recording medium or storage device which is readable by a computer in which at least one nucleotide sequence 
information selected from SEQ ID NOS:1 to 3501 or function information based on the nucleotide sequence is 
recorded, and is usable in the system of claim 23 or 27 or the method of claim 24 or 28. 

36. A recording medium or storage device which is readable by a computer in which at least one amino acid sequence 
information selected from SEQ ID NOS:3502 to 7001 or function information based on the amino acid sequence 
is recorded, and is usable in the system of claim 25 or 29 or the method of claim 26 or 30. 

37. The recording medium or storage device according to claim 35 or 36, which is a computer readable recording 
medium selected from the group consisting of a floppy disc, a hard disc, a magnetic tape, a random access memory 
(RAM), a read only memory (ROM), a magneto-optic disc (MO), CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM 
and DVD-RW. 

38. A polypeptide having a homoserine dehydrogenase activity, comprising an amino acid sequence in which the Val 
residue at the 59th position in the amino acid sequence of homoserine dehydrogenase derived from a coryneform bacterium
is replaced with an amino acid residue other than a Val residue. 

39. A polypeptide comprising an amino acid sequence in which the Val residue at the 59th position in the amino acid 
sequence as represented by SEQ ID NO:6952 is replaced with an amino acid residue other than a Val residue. 



40. The polypeptide according to claim 38 or 39, wherein the Val residue at the 59th position is replaced with an Ala 
residue. 
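
As an illustration only, and not as part of the claims, the residue replacement recited in claims 38 to 40 can be modelled as a checked single-position substitution in a one-letter amino acid string. The helper name `substitute_residue` and the toy sequence are assumptions made for this example; the toy sequence is not the sequence of SEQ ID NO:6952.

```python
def substitute_residue(sequence: str, position: int, expected: str, replacement: str) -> str:
    """Replace the residue at a 1-based position, checking the original residue.

    Raises ValueError if the position is out of range or the residue found
    there is not the expected one.
    """
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    index = position - 1
    if sequence[index] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


# Toy sequence with Val (V) at position 59; purely hypothetical.
toy = "M" + "G" * 57 + "V" + "K" * 5
mutant = substitute_residue(toy, 59, "V", "A")
```

Checking the expected residue before replacing it guards against off-by-one errors between 1-based claim positions and 0-based string indices.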

41. A polypeptide having pyruvate carboxylase activity, comprising an amino acid sequence in which the Pro residue
at the 458th position in the amino acid sequence of pyruvate carboxylase derived from a coryneform bacterium is
replaced with an amino acid residue other than a Pro residue.

42. A polypeptide comprising an amino acid sequence in which the Pro residue at the 458th position in the amino acid 
sequence represented by SEQ ID NO:4265 is replaced with an amino acid residue other than a Pro residue. 



43. The polypeptide according to claim 41 or 42, wherein the Pro residue at the 458th position is replaced with a Ser
residue.



44. The polypeptide according to any one of claims 38 to 43, which is derived from Corynebacterium glutamicum. 

45. A DNA encoding the polypeptide of any one of claims 38 to 44. 



46. A recombinant DNA comprising the DNA of claim 45. 

47. A transformant comprising the recombinant DNA of claim 46. 

48. A transformant comprising in its chromosome the DNA of claim 45. 



49. The transformant according to claim 47 or 48, which is derived from a coryneform bacterium. 

50. The transformant according to claim 49, which is derived from Corynebacterium glutamicum.

51. A method for producing L-lysine, comprising: 

culturing the transformant of any one of claims 47 to 50 in a medium to produce and accumulate L-lysine in 
the medium, and 

recovering the L-lysine from the culture. 

52. A method for breeding a coryneform bacterium using the nucleotide sequence information represented by SEQ 
ID NOS:1 to 3431, comprising the following: 

(i) comparing a nucleotide sequence of a genome or gene of a production strain derived from a coryneform
bacterium which has been subjected to mutation breeding so as to produce at least one compound selected from
an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof by a fermentation
method, with a corresponding nucleotide sequence in SEQ ID NOS:1 to 3431;

(ii) identifying a mutation point present in the production strain based on a result obtained by (i); 

(iii) introducing the mutation point into a coryneform bacterium which is free of the mutation point, or deleting 
the mutation point from a coryneform bacterium having the mutation point; and 

(iv) examining productivity by the fermentation method of the compound selected in (i) of the coryneform 
bacterium obtained in (iii). 
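
As an illustration only, and not as part of the claims, steps (i) and (ii) of claim 52 can be sketched as a position-by-position comparison of a production-strain sequence with the corresponding reference sequence. The name `find_mutation_points` and the example fragments are assumptions made for this example, and the sketch covers substitutions only.

```python
from typing import List, Tuple


def find_mutation_points(reference: str, production: str) -> List[Tuple[int, str, str]]:
    """List (1-based position, reference base, production base) for each
    substitution between two sequences of equal length.

    Insertions and deletions are out of scope for this sketch; sequences of
    different length are rejected rather than aligned.
    """
    if len(reference) != len(production):
        raise ValueError("sequences must be pre-aligned to the same length")
    return [
        (i + 1, ref, alt)
        for i, (ref, alt) in enumerate(zip(reference.upper(), production.upper()))
        if ref != alt
    ]


# Hypothetical gene fragments; a real comparison would use a sequence from
# SEQ ID NOS:1 to 3431 and the corresponding production-strain sequence.
reference_gene = "ATGGTTAAGCCA"
production_gene = "ATGGCTAAGTCA"
points = find_mutation_points(reference_gene, production_gene)
```

Each reported position is a candidate mutation point; whether it affects productivity is what steps (iii) and (iv) of the claim examine experimentally.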

53. The method according to claim 52, wherein the gene is a gene encoding an enzyme in a biosynthetic pathway or 
a signal transmission pathway. 

54. The method according to claim 52, wherein the mutation point is a mutation point relating to a useful mutation 
which improves or stabilizes the productivity. 

55. A method for breeding a coryneform bacterium using the nucleotide sequence information represented by SEQ
ID NOS:1 to 3431, comprising: 



(i) comparing a nucleotide sequence of a genome or gene of a production strain derived from a coryneform
bacterium which has been subjected to mutation breeding so as to produce at least one compound selected from
an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof by a fermentation
method, with a corresponding nucleotide sequence in SEQ ID NOS:1 to 3431;

(ii) identifying a mutation point present in the production strain based on a result obtained by (i);

(iii) deleting a mutation point from a coryneform bacterium having the mutation point; and 
(iv) examining productivity by the fermentation method of the compound selected in (i) of the coryneform
bacterium obtained in (iii). 

56. The method according to claim 55, wherein the gene is a gene encoding an enzyme in a biosynthetic pathway or
a signal transmission pathway.

57. The method according to claim 55, wherein the mutation point is a mutation point which decreases or destabilizes 
the productivity. 

58. A method for breeding a coryneform bacterium using the nucleotide sequence information represented by SEQ
ID NOS:2 to 3431, comprising the following: 



(i) identifying an isozyme relating to biosynthesis of at least one compound selected from an amino acid, a
nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof, based on the nucleotide
sequence information represented by SEQ ID NOS:2 to 3431;

(ii) classifying the isozyme identified in (i) into an isozyme having the same activity; 

(iii) mutating all genes encoding the isozyme having the same activity simultaneously; and 

(iv) examining productivity by a fermentation method of the compound selected in (i) of the coryneform
bacterium which has been transformed with the gene obtained in (iii).

59. A method for breeding a coryneform bacterium using the nucleotide sequence information represented by SEQ 
ID NOS:2 to 3431, comprising the following: 



(i) arranging function information of an open reading frame (ORF) represented by SEQ ID NOS:2 to 3431;

(ii) allowing the arranged ORF to correspond to an enzyme on a known biosynthesis or signal transmission
pathway;

(iii) explicating an unknown biosynthesis pathway or signal transmission pathway of a coryneform bacterium
in combination with information relating to a known biosynthesis pathway or signal transmission pathway of a
coryneform bacterium;

(iv) comparing the pathway explicated in (iii) with a biosynthesis pathway of a target useful product; and

(v) transgenetically varying a coryneform bacterium based on the nucleotide sequence information to either 
strengthen a pathway which is judged to be important in the biosynthesis of the target useful product in (iv) or 
weaken a pathway which is judged not to be important in the biosynthesis of the target useful product in (iv). 



60. A coryneform bacterium, bred by the method of any one of claims 52 to 59.



61. The coryneform bacterium according to claim 60, which is a microorganism belonging to the genus
Corynebacterium, the genus Brevibacterium, or the genus Microbacterium.

62. The coryneform bacterium according to claim 61, wherein the microorganism belonging to the genus Corynebacterium
is selected from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum,
Corynebacterium acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium,
Corynebacterium melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

63. A method for producing at least one compound selected from an amino acid, a nucleic acid, a vitamin, a saccharide, 
an organic acid and an analogue thereof, comprising: 

culturing a coryneform bacterium of any one of claims 60 to 62 in a medium to produce and accumulate at
least one compound selected from an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid,
and analogues thereof; and

recovering the compound from the culture.

64. The method according to claim 63, wherein the compound is L-lysine. 

65. A method for identifying a protein relating to useful mutation based on proteome analysis, comprising the following: 



(i) preparing
a protein derived from a bacterium of a production strain of a coryneform bacterium which has been
subjected to mutation breeding by a fermentation process so as to produce at least one compound selected
from an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof, and
a protein derived from a bacterium of a parent strain of the production strain;

(ii) separating the proteins prepared in (i) by two dimensional electrophoresis; 

(iii) detecting the separated proteins, and comparing an expression amount of the protein derived from the 
production strain with that derived from the parent strain; 

(iv) treating the protein showing different expression amounts as a result of the comparison with a peptidase 
to extract peptide fragments; 

(v) analyzing amino acid sequences of the peptide fragments obtained in (iv); and 

(vi) comparing the amino acid sequences obtained in (v) with the amino acid sequence represented by SEQ 
ID NOS:3502 to 7001 to identify the protein having the amino acid sequences.
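
As an illustration only, and not as part of the claims, step (vi) of claim 65 can be sketched as a search for stored protein sequences that contain every sequenced peptide fragment. The name `identify_proteins` and the database entries are hypothetical and are not taken from the sequence listing.

```python
from typing import Dict, Iterable, List


def identify_proteins(peptides: Iterable[str], proteins: Dict[str, str]) -> List[str]:
    """Return the ids of proteins whose sequence contains every peptide fragment."""
    fragments = [p.upper() for p in peptides]
    return [
        protein_id
        for protein_id, sequence in proteins.items()
        if all(fragment in sequence.upper() for fragment in fragments)
    ]


# Hypothetical database entries standing in for SEQ ID NOS:3502 to 7001.
database = {
    "prot_1": "MTSASAPSFNPGKGPGSAVGIALLGFGTVGTEV",
    "prot_2": "MSTHTSSTLPAFKKILVANRGEIAVRAFRAALE",
}
matches = identify_proteins(["GPGSAVG", "LLGFG"], database)
```

Requiring all fragments to match is a strict choice made for this sketch; sequencing errors in real peptide data would usually call for a tolerant match.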

66. The method according to claim 65, wherein the coryneform bacterium is a microorganism belonging to the genus 
Corynebacterium, the genus Brevibacterium, or the genus Microbacterium.

67. The method according to claim 66, wherein the microorganism belonging to the genus Corynebacterium is selected 
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium 
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

68. A biologically pure culture of Corynebacterium glutamicum AHP-3 (FERM BP-7382).